Sarah Constantin is an AI and science researcher. Early in her career, she was a deployed computational engineer at Palantir, and she later founded a company to research human lifespan extension technologies. She's now a program officer at Renaissance Philanthropy funding AI tools for advanced math research. Today in Pirate Wires, Constantin imagines six creative, genuinely useful applications for LLMs that go far beyond "a chatbot, but better."
This piece was originally published in Constantin's Substack, Rough Diamonds.
--
I'm convinced that people who are interested in large language models (LLMs) are overwhelmingly focused on general-purpose "performance" at the expense of exploring useful (or fun) applications.
As I work on a personal project, I've been learning my way around HuggingFace, a hosting platform, set of libraries, and almost-social-network for the open-source AI community. It's fascinating, and worth exploring even if you're not going to be developing foundation models from scratch yourself. If you simply want to use the latest models, build apps around them, or adapt them slightly to your own purposes, HuggingFace seems like the clear place to go.
You can look at trending models and trending public "spaces" (cloud-hosted instances of models that users can test) to get a sense of where the "energy" is. And what I notice is that almost all the "energy" in LLMs is in general-purpose models, competing on general-purpose question-answering benchmarks, sometimes specialized to particular languages or to math or coding.
"How can I get something that behaves basically like ChatGPT or Claude or Gemini, but gets fewer things wrong, and ideally requires less computing power and gets the answer faster?" is an important question, but it's far from the only interesting one!
If I really search, I can find "interesting" specialized applications like "predicts a writer's OCEAN personality scores based on a text sample" or "uses abliteration to produce a wholly uncensored chatbot that will indeed tell you how to make a pipe bomb," but mostly... it's general-purpose models. Not applications for specific uses I might actually try.
Some applications seem eager to go to the most creepy and inhumane use cases. No, I don't want little kids talking to a chatbot toy. No, I don't want to talk to a chatbot on a necklace or pair of glasses. (In public? Imagine the noise pollution!) No, I certainly don't want a bot writing emails for me!
Even one of the apps I found potentially cool, an AI diary that analyzes your writing and gives you personalized advice, ended up being so preachy that I canceled my subscription.
In the short term, of course, the most economically valuable thing to do with LLMs is duplicating human labor, so it makes sense that the priority application is autogenerated code. But the most creative and interesting potential applications go beyond "doing things humans can already do, but cheaper" to do things humans can't do at all on a comparable scale.
To some extent, social media, search, and recommendation engines were supposed to enable us to get the "content" we want. And, to the extent that's turned out to be a disappointment, people complain that getting exactly what you want is counterproductive. We end up with filter bubbles, superstimuli, and the like.
But I find that we actually have incredibly crude tools for getting what we want.
We can follow or unfollow, block or mute people; we can upvote and downvote pieces of content and hope "the algorithm" feeds us similar results; we can mute particular words or tags.
What we can't do, yet, is define a "quality," "genre," or "vibe" we're looking for and filter by that criterion. The old tagging systems (on Tumblr or AO3 or Delicious, or back when hashtags were used unironically on Twitter) were the closest approximation to customizable selectivity, and they're still pretty crude.
We can do a lot better now.
This is a browser extension.
You teach the LLM, by highlighting and saving examples, what you consider "unwanted" content that you'd prefer not to see. The model learns a classifier to sort all text in your browser into "wanted" vs. "unwanted," and shows you only the "wanted" text, leaving everything else blank.
Unlike muting/blocking particular people (who may produce a mix of objectionable and unobjectionable content) or muting particular words or phrases (which are vulnerable to context confusion, e.g. if you want to mute "woke" in a political sense but not "I woke up this morning"), you can teach your own personal machine a gestalt of the sort of thing you'd prefer not to see, and adjust it to taste. And you don't need to trust a third-party moderator to decide for you.
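To make this concrete, here's a minimal sketch of the classifier at the heart of such an extension. It assumes a small embedding model plus a simple linear classifier stand in for "the LLM"; the model name, the example snippets, and the threshold are all illustrative choices of mine, not anything specified above.

```python
# Minimal sketch of the "wanted vs. unwanted" classifier (illustrative only).
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # any sentence-embedding model would do

# Snippets the user highlighted and saved while browsing (hypothetical examples).
unwanted = [
    "Outrage-bait thread about a celebrity feud, quote-tweeted with insults.",
    "Aggressive political dunk that's all heat and no information.",
]
wanted = [
    "Long, careful explainer on how a piece of infrastructure actually works.",
    "Local news item about a library reopening after renovations.",
]

X = embedder.encode(unwanted + wanted)
y = [1] * len(unwanted) + [0] * len(wanted)
clf = LogisticRegression().fit(X, y)

def is_unwanted(text: str, threshold: float = 0.5) -> bool:
    """Return True if a block of page text should be blanked out."""
    prob = clf.predict_proba(embedder.encode([text]))[0][1]
    return prob >= threshold

# The browser extension would run is_unwanted() over each text block on the
# page before rendering, hiding whatever crosses the user's threshold.
```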
You would, of course, be able to make multiple filters and toggle between them if you wanted to "see the world" differently at different times.
You'd be able to share your filters. Some would probably become popular and widely used, the way Lists on Twitter/X and a few simple browser extensions like Shinigami Eyes are now.
This is also a browser extension.
In addition to hiding unwanted text, you could label text according to more general, user-defined classifications that a model learns from your examples.
For instance:
I expect it's more difficult, but it may be possible for the LLM to infer characteristics pertaining to the quality/validity of discussion:
This would display information about what kind of text we are reading: something we can certainly detect on our own, but which can also sneak up on us unnoticed. A "cognitive prosthetic" like color-coded text could be helpful for maintaining perspective or prioritizing: "Oh hey, I've been reading angry stuff all day, no wonder I'm getting angry myself," or "let me read the stuff highlighted as high-quality first."
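A quick sketch of how the color-coding might be driven, assuming a prompt-based classifier; the label set, the model name, and the prompt wording are my own placeholders, not categories proposed above.

```python
# Sketch: tag each passage with one user-defined label, to be mapped to a highlight color.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

LABELS = ["angry/outraged", "advertising", "news reporting", "speculation", "high-quality argument"]

def label_passage(text: str) -> str:
    prompt = (
        "Classify the passage below. Reply with exactly one label from this list:\n"
        + "\n".join(f"- {label}" for label in LABELS)
        + f"\n\nPassage:\n{text}"
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; substitute whatever model you prefer
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content.strip()

# The extension would tint each block of text by its label, so "what kind of
# thing am I reading right now?" is visible at a glance.
```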
This could be an app.
You'd give it a set of resources (blog, forum, social media feed, etc.) that you don't want to actually read, and assign it to give you a digest of facts (news-style, who/what/when/where concrete details) that come up in those sources.
For instance, back in January 2020, early online discussion of COVID-19 often took place on sites like 4chan where racially offensive language is common. To learn there was a new deadly epidemic in China, you'd have to expose yourself to a lot of content most people would rather not see.
It should be well within the capacity of modern LLMs to filter out jokes, rhetoric, and opinionated commentary, isolating "newsworthy" claims of fact and presenting them relatively neutrally.
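Something like the following could sit at the core of such a digest app; the prompt, model name, and output format here are guesses at one workable setup rather than a spec.

```python
# Sketch: extract neutral, newsworthy factual claims from noisy source text.
from openai import OpenAI

client = OpenAI()

def extract_facts(raw_text: str) -> list[str]:
    prompt = (
        "From the text below, list only concrete factual claims "
        "(who/what/when/where). Omit jokes, insults, rhetoric, and opinion. "
        "Rewrite each claim in neutral language, one claim per line.\n\n" + raw_text
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    lines = resp.choices[0].message.content.splitlines()
    return [line.lstrip("-• ").strip() for line in lines if line.strip()]

# A daily job could run extract_facts() over each followed source and merge the
# results into one digest, so you get the "new epidemic in China" signal without
# wading through the surrounding noise.
```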
I don't love LLM applications for "text summarization" because I usually worry about the auto-summary missing something important about the original document. Lots of these summarization tools seem geared for people who don't actually like to read (otherwise, why not just read the original?). But summarization could become useful if it's more like trawling for notable "signal" in very noisy (or aversive) text.
This is a browser extension that would translate everything into plain language, or language at a lower reading level. The equivalent of Simple English Wikipedia, but autogenerated and for everything.
I don't find that current commercial LLMs are actually very good at this! I'm not sure how much additional engineering work would be necessary to make this work, but it might literally save lives.
People with limited literacy or cognitive disabilities can find themselves in terrible situations when they can't understand documents. Simplifying bureaucratic or official language so more people can understand it would be a massive public service.
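Here's one way a plain-language rewriter might be wired up, with a readability check in the loop precisely because a single pass often doesn't simplify enough. The target grade level, retry count, and model name are illustrative assumptions.

```python
# Sketch: rewrite text at a lower reading level, re-prompting until a readability score is met.
import textstat
from openai import OpenAI

client = OpenAI()

def simplify(text: str, target_grade: float = 6.0, max_tries: int = 3) -> str:
    current = text
    for _ in range(max_tries):
        if textstat.flesch_kincaid_grade(current) <= target_grade:
            return current
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder
            messages=[{
                "role": "user",
                "content": (
                    "Rewrite the following at roughly a 6th-grade reading level. "
                    "Keep every requirement, amount, and deadline intact:\n\n" + current
                ),
            }],
        )
        current = resp.choices[0].message.content
    return current  # best effort if the target level was never reached
```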
For better or for worse, people end up using LLMs as oracles. If you're counting on the LLM to give you definitely correct advice or answers, that's foolish. But if you merely want it to be about as good as asking your friends or doing a five-minute Google search, it can be fine.
What makes an LLM special is that it combines a store of information, a natural language user interface, and a random number generator.
If you're indecisive and you literally just need to pick an option, a simple coin flip will do; but if you feel like it might be important to incorporate personalized context about your situation, you can just dump the text into the LLM and trust that "somehow" it'll take that into account.
The key "oracular" function is not that the LLM needs to be definitely right but that it needs to be a neutral or impersonal source, like a dice roll or a pattern of bone cracks. Two parties can commit to abiding by "whatever the oracle says" even if the oracle is in no way "intelligent"; but intelligence is certainly a bonus.
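In code, the "oracle" use is almost trivially small, which is part of the point. A possible sketch, with the temperature setting and model name as my own assumptions:

```python
# Sketch: an impersonal tiebreaker that picks one option, optionally weighing free-form context.
import random
from openai import OpenAI

client = OpenAI()

def oracle(options: list[str], context: str = "") -> str:
    if not context:
        return random.choice(options)  # with no context, a coin flip is all you need
    resp = client.chat.completions.create(
        model="gpt-4o-mini",   # placeholder
        temperature=1.0,       # keep a bit of the dice-roll quality
        messages=[{
            "role": "user",
            "content": (
                f"Context: {context}\n\n"
                f"Pick exactly one of these options and reply with it verbatim, "
                f"nothing else: {options}"
            ),
        }],
    )
    return resp.choices[0].message.content.strip()
```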
This works best as an app.
It's inspired by r/AmITheAsshole's model: given an interpersonal conflict, who's the "asshole" (rude, unethical, unreasonable, etc.)? It's possible for multiple parties to be "assholes," or for nobody to be.
The mechanism: each party describes the conflict from their own point of view, and the LLM weighs in with a verdict on who, if anyone, behaved unreasonably.
Of course, this is not enforceable; nobody has to take the LLM's advice. But nobody has to take a couples therapist's advice either, and people still go to them.
A neutral third party who can weigh in on everyday disputes is incredibly valuable (this is what clergy often wound up doing in more religious societies), and we lack accessible, secular, private means to do this today.
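One way the app might work under the hood; this is my own guess at a flow, with placeholder names throughout, since the details admit many variations:

```python
# Sketch: each party submits their account; the model returns an AITA-style verdict with reasoning.
from openai import OpenAI

client = OpenAI()

def arbitrate(account_a: str, account_b: str) -> str:
    prompt = (
        "Two people describe the same interpersonal conflict. As a neutral third "
        "party, and in the spirit of r/AmITheAsshole, say who (if anyone) behaved "
        "unreasonably and explain why. It is fine to conclude that both, or "
        "neither, were in the wrong.\n\n"
        f"Person A's account:\n{account_a}\n\n"
        f"Person B's account:\n{account_b}"
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

# Both parties would see the same verdict, which is what makes it useful as a
# shared reference point even though nothing about it is binding.
```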
This is an LLM-powered bot you can include in group chats (Discord, Slack, and the like).
The bot is trained to detect conversational dynamics:
What could you do with this sort of information? Potentially:
Some implementations would be very similar to human moderation, but probably more nuanced than any existing auto-moderation system; other implementations would be unsettling but potentially illuminating social experiments. They might help people gain insight into how they show up socially.
The option to ask the bot to "weigh in" (e.g., "Hey bot, did Alice avoid answering my question right there?") can build common knowledge about conversational "tactics" that are often left plausibly deniable. Plausible deniability isn't necessarily a bad thing, but at its worst it enables gaslighting. A bot that can serve as a "third party" in even a private conversation, if all parties can trust it not to have a preexisting bias, can be a sort of recourse: "Hey, it's not just my imagination, right? Something shady just happened there."
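A sketch of that "weigh in" feature, assuming the bot keeps a rolling transcript of the chat; the transcript format, system prompt, and model name are placeholders of mine:

```python
# Sketch: answer a member's question about the conversation as a neutral observer.
from openai import OpenAI

client = OpenAI()

def weigh_in(transcript: list[tuple[str, str]], question: str) -> str:
    log = "\n".join(f"{speaker}: {message}" for speaker, message in transcript)
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a neutral observer of a group chat. Describe conversational "
                    "dynamics factually and even-handedly; do not take sides."
                ),
            },
            {"role": "user", "content": f"Chat log:\n{log}\n\nQuestion: {question}"},
        ],
    )
    return resp.choices[0].message.content

# e.g. weigh_in(recent_history, "Did Alice avoid answering my question just now?")
```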
All of our mechanisms for managing digital communication were invented before we had advanced tools for analyzing and generating natural language. So many of the technologies we now think of as failures may be worth revisiting; they may prove more tractable now that LLMs exist.
As I remember, during the rise of "Web 2.0" back in the late 2000s and early 2010s, we were continually learning new behavioral patterns enabled by digital tools. We each experienced a first time for ordering delivery food online, or ordering a taxi online, or filling out a personality quiz, or posting on a social media site or forum, or making a video call.
And then, for a while, those "firsts" stagnated. All the basic "types of things you could do" on the internet were the same ones you'd been doing five or ten years ago.
I don't think that's fundamentally a technological stagnation. It's not really that we'd reached the limit of what can be done with CRUD (Create, Read, Update, Delete) apps. Instead, it may have been a cultural consolidation: the center of gravity moved to bigger tech companies, and eyeballs moved to a handful of social media sites.
Now, I have a sense that LLMs might be able to restart the whole "what could I do with a computer?" discussion. Some of our answers to that question will rely on new AI capabilities; others will feel familiar because they're things we could have done before LLMs, but it didn't occur to anyone to try.
What if, for instance, an LLM "decided" how to match dating profiles for compatibility?
Well, you could have done that in 2010 with a dot product between multiple-choice questionnaire responses, and OkCupid did. But... shh, never mind. Because we want nice things, and we should appreciate pixie dust (even computationally expensive pixie dust) that makes nice things seem possible.
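For reference, the decade-ago version really is about this small; the encoding below is illustrative, and the real OkCupid formula was more elaborate (it weighted how much each question mattered to each user), but a dot product is the heart of it.

```python
# Sketch: 2010-style compatibility as a dot product over one-hot questionnaire answers.
import numpy as np

def one_hot(answers: list[int], n_options: int = 3) -> np.ndarray:
    """Encode multiple-choice answers so the dot product counts matching answers."""
    vec = np.zeros(len(answers) * n_options)
    for question, answer in enumerate(answers):
        vec[question * n_options + answer] = 1.0
    return vec

alice = one_hot([0, 2, 1, 1])  # hypothetical answers to four questions
bob   = one_hot([0, 1, 1, 1])

compatibility = float(np.dot(alice, bob))  # 3.0: they gave the same answer on 3 of 4 questions
```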
And the ability to work with language as a truly malleable medium allows for considerably nicer things than the decade-ago version did. Many nice things are not fundamentally dependent on any future advances in technical capability. You can do them with what we have now, and maybe even with what we had yesterday.
Sarah Constantin