Practical Tools for Resisting the Control Regime
I made a PyScript app that searches the WantToKnow news archive in a special way.
I often write about the importance of technological solutions to the problem of corporate and government interference with free speech. Censorship is rampant in search, news media, and social media. The website I work for, WantToKnow.info, is censored by Google and all of the main social media platforms. Our site is awesome and it's getting better all of the time. But establishment interests clearly don't want people to read and talk about the kind of news we carefully document and archive.
Our archive itself contains about 13k news article summaries, mostly on high-level corruption and cover-ups. Searching our site by keywords or category is easy and returns great results. Yet situations may arise when a user can describe what they're looking for but doesn't know what keywords to search for. So I made a little PyScript app that searches our archive based on TF-IDF vector cosine similarity.
Vector Search
This method captures keywords, but it also finds matches based on the mathematical similarities between a search query and the articles in the dataset, so people can search with only vague ideas and still get back relevant results. For the technically inclined, I've shared the process of making this app on my Hive blog. You can read about it and see code examples here, here, here, here, here, and here. To try out vector search for yourself, here it is in action.
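For readers who want a feel for how TF-IDF cosine similarity ranks results, here is a minimal sketch in plain Python. It is not the code from the app. The function names, the smoothed IDF formula, and the three sample summaries are all made up for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical stand-ins for archive summaries (not real archive entries).
SAMPLE_SUMMARIES = [
    "Bank executives avoided prosecution after laundering money for drug cartels",
    "Leaked documents reveal mass surveillance of phone records by intelligence agencies",
    "Pharmaceutical company hid clinical trial data showing drug risks",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def vectorize(tokens, idf):
    """Build a sparse {term: count * idf} vector, dropping terms the index has never seen."""
    counts = Counter(tokens)
    return {term: count * idf[term] for term, count in counts.items() if term in idf}


def build_index(documents):
    """Compute smoothed IDF weights and one TF-IDF vector per document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF (an assumption; other variants exist): rarer terms weigh more.
    idf = {term: math.log((1 + n_docs) / (1 + df)) + 1 for term, df in doc_freq.items()}
    return idf, [vectorize(tokens, idf) for tokens in tokenized]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, documents, top_k=3):
    """Return up to top_k (document, score) pairs with a nonzero score, best first."""
    idf, vectors = build_index(documents)
    query_vector = vectorize(tokenize(query), idf)
    scored = [(cosine(query_vector, vec), i) for i, vec in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(documents[i], score) for score, i in scored[:top_k] if score > 0]
```

Calling `search("spying on phone calls", SAMPLE_SUMMARIES)` ranks the surveillance summary first because it shares the word "phone" with the query. In a sketch like this, a query only scores against articles that share at least one term with it, so longer, more descriptive queries tend to give the ranking more to work with.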
Along with vector search, the first prototype of this page displays the most recent messages from a Telegram group I started called WantToKnow Unofficial. Getting this done required me to interact with something called the BotFather, which was a whole thing. The idea is that anyone using the app should have a place where they can connect with others who are looking at the same material. The page is coded in a way that allows people to do this even if they're just running it locally from a source file.
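As a rough idea of what pulling recent messages through a bot can look like, here is a sketch built around the Telegram Bot API's `getUpdates` method. I'm assuming that method for illustration; the app may retrieve messages differently. The helper names, the placeholder token, and the sample payload are hypothetical, and the sketch only builds the request URL and parses a response rather than making a network call.

```python
import json

API_BASE = "https://api.telegram.org"


def get_updates_url(token, limit=20):
    """Build the getUpdates URL for a bot token issued by the BotFather."""
    return f"{API_BASE}/bot{token}/getUpdates?limit={limit}"


def recent_messages(payload, count=5):
    """Extract the newest text messages from a parsed getUpdates response."""
    if not payload.get("ok"):
        return []
    messages = []
    for update in payload.get("result", []):
        message = update.get("message")
        # Skip updates that are not plain text messages (joins, photos, edits).
        if not message or "text" not in message:
            continue
        messages.append({
            "sender": message.get("from", {}).get("first_name", "unknown"),
            "date": message.get("date", 0),  # Unix timestamp
            "text": message["text"],
        })
    messages.sort(key=lambda m: m["date"])
    return messages[-count:]


# Made-up response in the shape the Bot API returns, for demonstration only.
SAMPLE_RESPONSE = json.loads("""
{"ok": true, "result": [
  {"update_id": 1, "message": {"message_id": 10, "date": 1700000000,
    "from": {"id": 1, "first_name": "Ana"}, "text": "Anyone read the latest summary?"}},
  {"update_id": 2, "message": {"message_id": 11, "date": 1700000100,
    "from": {"id": 2, "first_name": "Ben"}, "new_chat_members": []}},
  {"update_id": 3, "message": {"message_id": 12, "date": 1700000200,
    "from": {"id": 2, "first_name": "Ben"}, "text": "Yes, sharing it now."}}
]}
""")
```

Here `recent_messages(SAMPLE_RESPONSE)` keeps the two text messages in chronological order and skips the join event. One design caveat: a token embedded in a page that people can download is visible to anyone who opens the source, so it should be treated as public.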
Realistically, vector search may not be worth integrating into our site's existing search. Even if this turns out to be the case, the project paved the way for us to begin integrating search by publication. It also provided useful stats about our dataset.
Decentralized and Censorship-Proof
Instead of a traditional tech stack, this app uses IPFS for the data storage layer and a Telegram bot for message retrieval, while everything else is done with a single HTML file that's only 13.2 KB. Using IPFS means that the data is immutable. IPFS addresses content by its hash, so altering even a single byte would produce a different address. Unlike with a standard server-hosted database, a third party can't quietly change the data sitting behind the address the app requests. And the ability to run the page locally from just one tiny file means that even the most paranoid privacy advocates can explore our archive without fearing that their identifying information will be stored by unknown third parties. That kind of tracking is common practice for news websites, though our site definitely doesn't do stuff like that.
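The immutability point comes down to content addressing, which can be illustrated in a few lines. This is a simplified sketch, not how IPFS computes real identifiers: actual CIDs wrap the hash in multihash and multibase encodings and split large files into chunks. The function names and the sample record are made up.

```python
import hashlib


def content_address(data: bytes) -> str:
    """Derive an address from the bytes themselves (plain SHA-256 hex digest)."""
    return hashlib.sha256(data).hexdigest()


def verify(data: bytes, expected_address: str) -> bool:
    """Check that retrieved bytes still match the address they were requested by."""
    return content_address(data) == expected_address


# Hypothetical archive record, used only to show the effect of tampering.
original = b'{"title": "Example summary", "year": 2024}'
address = content_address(original)

# Changing a single character yields bytes that no longer match the address.
tampered = original.replace(b"2024", b"2025")
```

With these values, `verify(original, address)` is true and `verify(tampered, address)` is false. Anyone who holds the address can detect tampering without having to trust whoever served the data.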
The Telegram integration is a total experiment. My thinking is to try the group out and see if anyone uses it. Maybe no one will, but in theory thousands of people could join up today and it would scale just fine. Right now, Telegram is great for free speech and it's mostly uncensored, but that could easily change in the future. I'm not sure if the Telegram group will ultimately turn out to be useful, but it does create new space for our readers to discuss the topics we cover.
Adding components to an app like this is straightforward. One feature I'm still working on is OpenAI integration. The idea is to use gpt-4o to generate a brief summary of search results and display this on the page. Another feature I'm considering is crypto integration in support of mutual aid efforts. It would even be possible for me to revive my Rstory token to use in the app in various ways.
These crypto ambitions may be lofty, yet ultimately I'm considering them for very practical reasons. Corporate and government interests routinely disrupt the finances of dissident groups. After the US government wiped out WikiLeaks' financial infrastructure, the organization was only able to survive because it adopted bitcoin in 2011. More recently, the Canadian government froze bank accounts of activists opposed to vaccine mandates, and PayPal dropped indie publisher MintPress News without warning.
In my opinion, one of society's most pressing needs is for media that the powers that be technically cannot control. Government and Big Tech are allies in manipulating public discourse, but we can defend ourselves against their efforts in a variety of ways. My app makes it easier for people to find reliable information on corruption and cover-ups, and makes it possible for the people finding this information to connect directly with each other. Someday soon, this app might include AI and crypto.
This won't change the world or anything, but it does showcase tech that has the potential to do just that. Data integrity. Decentralized social networks. Censorship resistance. Money beyond the reach of tyrants. All of these things can be realized today, with available tech. And more is becoming possible all the time.
For more of my writing, check out my scifi novels and my Hive blog.