Join the Memex Private Research Preview
Greetings, friends of Memex!
First off, thank you for your feedback as we’ve been iterating. If you’re receiving this email, it’s because you’ve played a role in our development and journey over the past several months. We plan to send these newsletters regularly to keep you apprised of our progress and thinking. We hope you enjoy them (and if you don’t, you can of course unsubscribe)!
The big news in this first issue: We started our Private Research Preview last Friday!

This PRP does not yet meet our bar for a product we’d put in “the wild”. But we’ve matured our product direction to something we have conviction in, and this PRP is designed to help us further sharpen focus with the help of your usage and feedback.
If you’d like to get access to it and haven’t yet, you can get the Memex Mac App at this link after answering a few questions.
API credits are on us: DM us on Discord for $50 in credits.
You can also use your own OpenAI or Azure keys if you prefer.
If you know someone who would be interested in Memex, feel free to forward this email to them. We are still in Private Preview, so please avoid posting about it publicly on social media.
Join our Discord to speak with us and others about Memex and broader industry trends!
What we’re about
79 years ago this month, Dr. Vannevar Bush published his seminal essay “As We May Think,” where he introduced the memex:
“Consider a future device for individual use, which is a sort of mechanized private file and library. It needs a name, and, to coin one at random, “memex” will do. A memex is a device in which an individual stores all [their] books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility. It is an enlarged intimate supplement to [their] memory.”
The memex is the namesake of our company—not because we are building a mechanized private file, but because of the ethos and vision introduced in Dr. Bush’s essay. He envisioned a world in which more productive scientific professionals used radical new ways of working to advance humanity’s collective knowledge. He had high expectations for the future, for human ingenuity, and for scientific and technological progress. And he coupled his long-term outlook with practical experimentation and design to make tangible progress toward his vision.
We have big shoes to fill. Whereas Dr. Bush’s vision was for a device to extend human memory, our vision is to augment and parallelize human cognitive faculties to advance scientific and technological progress.
What is Memex?
Our Memex is a synthetic engineering companion. It runs as a local app on your computer, and you can interact with it via a chat interface. Memex connects LLMs with our agent-computer interfaces (ACIs) that allow them to:
install and use command-line tools
write and execute code
run applications
access the internet
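The capabilities above boil down to a model choosing actions and an ACI executing them. Here is a minimal, hypothetical sketch of such a loop (the tool names, record format, and stubbed action are all invented for illustration; this is not Memex’s internal code):

```python
# Hypothetical sketch of an agent-computer interface (ACI) dispatch loop.
# Tool names and the hard-coded "action" are illustrative, not Memex internals.
import subprocess

def run_shell(command: str) -> str:
    """Run a shell command and return its combined output (one possible tool)."""
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout + result.stderr

def write_file(spec: dict) -> str:
    """Write model-generated code to disk (another possible tool)."""
    with open(spec["path"], "w") as f:
        f.write(spec["code"])
    return f"wrote {spec['path']}"

# Registry mapping tool names the model can request to their implementations.
TOOLS = {"shell": run_shell, "write_file": write_file}

def agent_step(action: dict) -> str:
    """Dispatch one model-chosen action to the matching tool."""
    tool = TOOLS[action["tool"]]
    return str(tool(action["input"]))

# A model would normally choose these actions; here one is hard-coded.
print(agent_step({"tool": "shell", "input": "echo hello from the ACI"}))
```

In a real agent, the string returned by each tool is fed back to the model, which then decides the next action; this loop is what lets a chat interface install tools, run code, and operate applications.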
Memex can help with common scenarios, like bootstrapping a project, resolving dependency conflicts, dockerizing repos, attempting to reproduce research from a GitHub repo, setting up an experiment, and creating + deploying one-off apps.
Check out what people are doing with Memex in the Discord.
Showcase
We’re excited by the early usage of the Private Research Preview. Early users are doing cool things with it, finding bugs, and probing the boundaries of its capabilities. We are bullish on the potential of AI systems to augment scientific and engineering professionals. But we also have a “hype allergy”: our goal is to take a balanced view of what AI + Memex can do. We use AI every day, and we see both its amazing successes and its failures up close. So each showcase item includes the relevant caveats.
1. Train a transformer in a Jupyter notebook locally

In this example, Memex . . .
created a Poetry environment
installed Jupyter
created a “hello, world” notebook and verified that it worked
initialized git with a .gitignore and committed the work
created a simple transformer training notebook, resolved the dependencies, then verified it worked properly
The above GIF was made after those steps were completed.
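For reference, the steps above map to roughly the following shell commands. This is a hand-written approximation (the notebook filenames are invented), not a transcript of what Memex actually ran:

```python
# Approximate shell commands for the steps above, kept as a dry-run plan.
# Filenames (hello_world.ipynb, train_transformer.ipynb) are made up for
# illustration; Memex chose its own names and ran commands interactively.
STEPS = [
    ("create a Poetry environment", "poetry init --no-interaction"),
    ("install Jupyter",             "poetry add jupyter"),
    ("smoke-test a notebook",
     "poetry run jupyter nbconvert --to notebook --execute hello_world.ipynb"),
    ("initialize git and commit",
     "git init && git add . && git commit -m 'initial commit'"),
    ("run the training notebook",
     "poetry run jupyter nbconvert --to notebook --execute train_transformer.ipynb"),
]

def show_plan(steps):
    """Render the plan as text without executing anything (dry run)."""
    return "\n".join(f"{desc}: {cmd}" for desc, cmd in steps)

print(show_plan(STEPS))
```

Running the notebook steps one at a time, and checking each result before moving on, is exactly the step-by-step discipline the caveat below refers to.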
Caveats: none specific. The whole example cost less than $1.00 and ran as expected. It encountered a couple of dependency issues that it resolved, which is expected behavior. That said, I followed the practices outlined in our Agent Engineering 101. A task like this could be quite failure-prone if the agent attempted to do it all at once instead of step by step.
2. New feature for a web project
A friend has an OSS project to track the lineage of foundation models: Unified Model Record. They wanted to add a dynamic visualization of the lineage of each foundation model, such as:
What data was it trained on?
Which models synthesized data that it was trained on?
Which models was it distilled from?
Memex was able to implement the feature.

The mind-blowing aspect of this is that I have never worked with either Graphviz or Pelican before, the two frameworks the project depends on. If I built this feature “manually,” I likely would have written some Python script outside of the Pelican framework. But LLMs know frameworks like Pelican from their pre-training data, and GPT-4o was able to recognize that this feature could be elegantly implemented as a Pelican “plugin.” My friend even called out the clever approach:

The entire feature, including multiple file edits and the pull request, was done in a single conversation with Memex, with no terminal, IDE, or other tooling. The pull request here shows everything Memex did.
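For a sense of what such a lineage visualization involves: Graphviz renders graphs written in the DOT language, so the core of the feature is emitting a DOT document from lineage records. A minimal, stdlib-only sketch follows; the model names and the record format are invented for illustration, and this is not the actual plugin code:

```python
def lineage_to_dot(model: str, lineage: dict) -> str:
    """Emit a Graphviz DOT digraph for one model's lineage records.

    `lineage` maps relation labels (e.g. "trained on", "distilled from")
    to lists of upstream models or datasets. (Hypothetical record format.)
    """
    lines = ["digraph lineage {", "  rankdir=BT;"]
    for relation, sources in lineage.items():
        for src in sources:
            # One labeled edge per lineage relation, child -> ancestor.
            lines.append(f'  "{model}" -> "{src}" [label="{relation}"];')
    lines.append("}")
    return "\n".join(lines)

# Invented example records, loosely following the three questions above.
dot = lineage_to_dot("ExampleModel-7B", {
    "trained on": ["WebCorpus-2023"],
    "synthetic data from": ["TeacherModel-70B"],
    "distilled from": ["TeacherModel-70B"],
})
print(dot)
```

The resulting DOT text can be piped to the `dot` command-line tool to produce an SVG; a Pelican plugin would hook this generation step into the site build rather than running it by hand.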
Caveats: The agent got derailed several times. I had to do significant agent-error handling to accomplish everything in a single conversation. And the conversation with Memex cost a whopping $65 with GPT-4o 🤯. There were plenty of things it struggled with, including:
installing conda
editing files containing multi-line strings
We pushed fixes for both of these issues. On the cost dimension, we put our heads together and believe we have a path to a >10x cost reduction; it’s in our queue.
What we’re reading
The Returns to Science in the Presence of Technological Risks: This research by Matt Clancy examines the potential return on investment of improving scientific institutions, suggesting that making science 10% more effective could generate an additional $2.5 trillion in social value annually. He also explores the risks and benefits of accelerating scientific progress, considering scenarios where rapid advancements could lead to both significant societal gains and existential risks, such as those from advanced biotechnology. The podcast about it is great!
[2406.07016] Delving into ChatGPT usage in academic writing through excess vocabulary: This paper investigates the impact of large language models (LLMs), like ChatGPT, on academic writing by analyzing excess vocabulary in 14 million PubMed abstracts from 2010 to 2024. The study found a significant increase in the frequency of stylistic words associated with LLMs, estimating that at least 10% of 2024 abstracts were processed with such models, with some fields and countries showing even higher usage.
GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models: The authors consider whether generative pre-trained transformers (GPTs) are general-purpose technologies (GPTs) in the economic sense. Other examples of general-purpose technologies include the steam engine, electricity, the combustion engine, and semiconductors. We’re particularly excited about their findings re: science and engineering:

That’s it for our first newsletter. Hope you enjoyed! Please share any feedback, whether positive or critical!
Ever forward,
The Memex Team