Creating another MCP server, but this one is for research

jspann | 1 pts | 0 comments

Creating another MCP server, but this one is for research - While in the lab

In the weeks following my last blog post, I had a niggling feeling that I could apply an MCP server to my literature review. This post is my first run at exploring that.

Literature review has been one of the hardest parts of my grad degree to learn: How do you take papers and distill them down to a supporting argument, a newfound gap, or evidence that something you want to build has a high chance of working? It's really time-consuming at first and, to my surprise, isn't a uniform process across labs and researchers. I keep a massive Google Sheet with a tab for each area I'm investigating, a column for each aspect I want to explore in that area, and a row for each paper I've found. With some starter papers in hand, I review the papers each one cites (backward pass) and the papers that cite it (forward pass), and add new papers to the spreadsheet based on relevance to the area. Another student I know follows something similar to a PRISMA approach: reviewing one conference at a time, screening papers, and including works in their manuscript based on eligibility. Another student frantically searches keywords on Google Scholar right before the deadline (this is ill-advised).

However, not everyone takes any of these approaches. Last year I reviewed a paper where every single citation was fake. All of them. It was not only a waste of my time, but if I hadn't caught it, it could have been published and given false credibility to an idea that hadn't been proven. This is one of the reasons why the pre-print site arXiv announced they're banning authors if they let LLMs generate the paper, and why ACL will desk reject papers with fake citations.

I think LLMs can have their place in the research process, but the review of related work is what makes a work trustworthy and part of a solid foundation for others to use. There are some LLM tools that try to improve on the paper-finding process (Google Scholar Labs, Undermind.ai), but reproducibility of that search process can still be an issue, and it can be unclear how a selected work fits into the broader scope of an area.

The jury is still out on how these models will be used in the future, but I was interested in how I could use one to make a tool to aid in my reviews.

Initial idea

I wanted to build a tool that helped me make arguments, was grounded in real research concepts, and could be audited for correctness.

I started with my spreadsheet of papers I've already reviewed and decided that building an MCP server that could read through those made sense. This way I also don't have any real storage costs, since I could use the Google Docs API to "host" the detailed attributes of papers that I had already vetted. There are 16 active-ish sheets with about 30 papers each, but I started with just four sheets (i.e. research areas).

A screenshot of my literature review spreadsheet on Google Sheets

I used Claude to write a short Python script to create a basic MCP server that I could host on AWS Lambda (free tier!). The server had a function (technically called a tool) for each sheet and an additional prompt for a description of my prototype, my study goal, and my planned study procedure.
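To make the "one tool per sheet" idea concrete, here is a minimal sketch of the data-handling side of such a tool: turning one tab, exported as CSV text, into one record per paper. It is not the script from this post. The column names (`Title`, `Year`, `Method`) and the function name `parse_papers` are hypothetical, and the sketch assumes the tab arrives as CSV rather than through whichever API call the real server makes. Registering the function as an MCP tool is left out so the example stays standard-library only.

```python
import csv
import io


def parse_papers(csv_text: str) -> list[dict[str, str]]:
    """Turn one tab's CSV text into a list of {column: value} records.

    Assumes the first row holds the column headers (one per aspect)
    and every following row is one paper.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    papers = []
    for row in reader:
        # Drop overflow cells (key None) and normalise missing values to "".
        cleaned = {
            key.strip(): (value or "").strip()
            for key, value in row.items()
            if key is not None
        }
        # Skip spacer rows where every cell is empty.
        if any(cleaned.values()):
            papers.append(cleaned)
    return papers


# Hypothetical sample standing in for one research-area tab.
SAMPLE_TAB = (
    "Title,Year,Method\n"
    "Paper A,2021,Interviews\n"
    ",,\n"
    "Paper B,2023,Survey\n"
)

papers = parse_papers(SAMPLE_TAB)
```

Returning plain dictionaries keeps each claim traceable to a specific row, which is what makes the output auditable against the spreadsheet.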

Once I wired the server into Cursor, I gave it a go:

Generating summaries of methods based on prior work I've reviewed

Generating arguments based on methods in prior work

I like this because I know how it arrived at these claims and I know which papers I need to review if I want more details on them. Plus, if I make changes in my spreadsheet, they're instantly reflected in my server. That being said, it isn't perfect: the output is unstructured and some papers were repeated since I didn't have a broad enough scope of that sub-area, but I think I could solve that with an updated review.

My next step was a new sheet/tab that holds all the gids (indices in Google Sheets that reference each sheet in the file) associated with an area. This made it so I could auto-generate a function that I could call for getting the papers from each area. I also added search functions so in...
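The auto-generation step can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this post: the area names, gids, `make_area_tools`, and `search_papers` are all hypothetical, and the `fetch` callable stands in for whatever actually loads a tab by gid. An in-memory dictionary plays that role here so the sketch runs without network access.

```python
from typing import Callable

Paper = dict[str, str]
Fetcher = Callable[[int], list[Paper]]


def make_area_tools(index: dict[str, int], fetch: Fetcher) -> dict[str, Callable[[], list[Paper]]]:
    """Build one zero-argument function per research area from an area -> gid index."""
    tools: dict[str, Callable[[], list[Paper]]] = {}
    for area, gid in index.items():
        # Bind gid as a default argument so each function keeps its own value
        # instead of all of them seeing the last gid in the loop.
        def tool(gid: int = gid) -> list[Paper]:
            return fetch(gid)

        name = "get_" + area.lower().replace(" ", "_") + "_papers"
        tool.__name__ = name
        tools[name] = tool
    return tools


def search_papers(papers: list[Paper], query: str) -> list[Paper]:
    """Case-insensitive substring search across every column of every paper."""
    needle = query.lower()
    return [p for p in papers if any(needle in v.lower() for v in p.values())]


# Hypothetical stand-in for the real spreadsheet: gid -> rows of that tab.
FAKE_SHEETS = {
    0: [{"Title": "Paper A", "Method": "Interviews"}],
    1234: [{"Title": "Paper B", "Method": "Survey"}],
}
# Hypothetical index tab contents: area name -> gid.
AREA_INDEX = {"Area One": 0, "Area Two": 1234}

tools = make_area_tools(AREA_INDEX, FAKE_SHEETS.__getitem__)
area_two = tools["get_area_two_papers"]()
hits = search_papers(area_two, "survey")
```

Because the functions are generated from the index, adding a new area only means adding a row to the index tab; no server code has to change.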
