TypeScript x AI Meetup Talk: Privacy-First AI Summarizing in TypeScript

On May 21, 2025, I was fortunate enough to present a TNG hacking project at the Munich TypeScript Meetup. Thanks to the organizer Carl Assmann and TNG Technology Consulting GmbH for having me!

The TypeScript x AI organizers and speakers: Luca Becker, Carl Assmann, Luisa Peter, Lucas L. Treffenstädt, Benjamin Behringer, Alexander Opalic, Johannes Loher

Talk Abstract

Nearly every AI-enabled product in 2025 has a summarizing function. But they all rely on third-party services to produce the summary, which raises major privacy concerns and is prohibitive for many of us in the community. We present a small React/TypeScript app that lets you summarize even long texts on your local machine, or anywhere else, with a Large Language Model of your choosing. We motivate the app and show it in a live demo with local and remote LLMs. We discuss different approaches to prompting and the quality of the summaries produced by various LLMs. We show the implementation and dive into the most interesting parts of the code.

Future Work

We are currently developing the app and plan to release a public version in the future. We will keep you posted via the TNG Twitter account, the Bored Consultant Twitter account, or my LinkedIn page.

Read More

LLM Performance on Consumer Hardware

LLMs are huge, slow to run, and require terabytes of RAM on the newest graphics cards, right? Well, maybe not. In this article we compare the performance of two popular current LLM families, llama 3.x and deepseek-r1, on a variety of consumer hardware, from laptop GPUs to dual Nvidia 5090s. We test quantized and full-precision models and show which ones can fit into the memory of your graphics card. We also test against Apple's M-series chips and explore what can be achieved with older but cheaper server hardware.

Test Setup

We test two of the currently popular LLM families, llama 3 and deepseek-r1, in quantized and 16-bit floating point (fp16) versions for their performance on consumer hardware.
We run the models using ollama 0.6.5 with open-webui as the frontend. We collect the measured response tokens after executing the following query:

I need a summary of the book “War and Peace”. Please write at least 500 words.

For each test, the model fits into the memory of the graphics card (VRAM). Mobile devices are allowed to cool down before each test run.
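The tokens-per-second figures can also be collected programmatically instead of reading them off the frontend. The sketch below is a minimal illustration, assuming a local Ollama instance on its default port; `eval_count` (generated tokens) and `eval_duration` (nanoseconds) are the statistics fields Ollama returns from a non-streaming `/api/generate` call. The exact setup used for this article may differ.

```python
import json
import urllib.request


def tokens_per_second(stats: dict) -> float:
    """Response tokens/sec: eval_count tokens generated over eval_duration nanoseconds."""
    return stats["eval_count"] / (stats["eval_duration"] / 1e9)


def benchmark(model: str, prompt: str, host: str = "http://localhost:11434") -> float:
    """Run one non-streaming generate request against Ollama and return tokens/sec."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return tokens_per_second(json.load(resp))
```

Averaging several `benchmark()` runs per model would smooth out warm-up and thermal effects, which matters especially for the mobile devices.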

Tested Models

In this article we focus on two of the currently popular LLMs:

  • Llama 3, developed by Meta, represents the latest evolution in open-source large language models, building upon its predecessors with enhanced capabilities, improved efficiency, and greater contextual understanding. It comes in various sizes, typically ranging from smaller, more efficient models suitable for edge computing and personal devices, to larger, more powerful variants designed for complex tasks and extensive data processing. Its versatility makes it ideal for applications such as chatbots, content generation, coding assistance, and research. For the AI community, Llama 3 signifies a significant step toward democratizing advanced AI technology, fostering innovation, and enabling broader access to sophisticated AI tools and research opportunities.
  • DeepSeek-R1, developed by DeepSeek AI, is a cutting-edge large language model built to excel at reasoning-heavy tasks such as mathematics, logic, and coding. It is released alongside a family of smaller distilled variants based on Qwen and Llama models, making it highly versatile for developers and tech professionals. Its primary uses include code generation, error detection, automated debugging, and providing detailed step-by-step technical explanations. For the AI community, DeepSeek-R1 represents a significant advancement in open reasoning models, enhancing productivity and accuracy in software development and contributing to the broader adoption of AI-driven coding solutions.
Model             Variant        Precision  VRAM Size
llama3.2:1b       instruct       Q8_0       2.7GB
llama3.2:3b       instruct       Q4_K_M     4GB
llama3.1:8b       instruct       Q4_K_M     6.9GB
llama3.3:70b      instruct       Q4_K_M     49GB
llama3.2:1b       instruct       fp16       3.9GB
llama3.2:3b       instruct       fp16       8.5GB
llama3.1:8b       instruct       fp16       17GB
deepseek-r1:7b    qwen distill   Q4_K_M     6GB
deepseek-r1:8b    llama distill  Q4_K_M     6.9GB
deepseek-r1:14b   qwen distill   Q4_K_M     11GB
deepseek-r1:32b   qwen distill   Q4_K_M     25GB
deepseek-r1:70b   llama distill  Q4_K_M     49GB
deepseek-r1:1.5b  qwen distill   fp16       4.2GB
deepseek-r1:7b    qwen distill   fp16       16GB
deepseek-r1:14b   qwen distill   fp16       32GB
Table 1: Tested LLMs

Table 1 shows the tested LLMs. Ollama uses quantized models by default. If you specify a more detailed tag, you can choose different quantizations as well as the non-quantized floating-point versions. We select llama 3.x and deepseek-r1 for their popularity and start with the smallest versions parameter-wise, then continue to higher-parameter models until they no longer fit into any of the test systems' VRAM.
The last column in the table shows the size of the model when loaded into VRAM, as reported by ollama ps. We can already see here that not all models will fit into the VRAM of the test systems from Table 2.
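As a rough sanity check on these numbers, the size of a model's weights can be approximated as parameters × bits per weight / 8. The snippet below is a back-of-the-envelope sketch, not ollama's actual memory accounting; the ~4.8 bits/weight for Q4_K_M is an assumed effective average, and the figure from ollama ps is higher still because it includes the KV cache and runtime buffers.

```python
def approx_weight_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB: parameters * bits / 8."""
    return params_billions * bits_per_weight / 8


# llama3.1:8b in fp16: 8 * 16 / 8 = 16 GB of weights (ollama ps reports 17GB)
# llama3.1:8b at Q4_K_M (~4.8 bits/weight effective): ~4.8 GB of weights (ollama ps: 6.9GB)
```

The gap between the estimate and the reported VRAM grows with context length, since the KV cache scales with the number of tokens kept in context.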

Read More

The Rebuttal: AI's Impact On Content Creation And Knowledge Sharing

In our earlier post “Will AI Kill The Blogosphere?” we discussed the impact of AI on content creation on the internet and how to react to this change. Now we want to find out what the AI itself thinks of the subject and of our previous article. We created a rebuttal using the following prompt:

Read the given article and write a response. Evaluate the premise and conclusion
in detail. Draw your own conclusion on the subject. Be honest and don't hold
back. Try to provide value for the reader.

The following is Grok 3’s answer. Enjoy!

Response to the Article: AI’s Impact on Content Creation and Knowledge Sharing

The article “The Impact of AI on Content Creation and Knowledge Sharing” explores how artificial intelligence is reshaping the landscape of content creation and the dissemination of knowledge. Below, I’ll evaluate the article’s premise and conclusion in detail, offer my own perspective, and provide actionable insights for readers navigating this evolving terrain. Let’s dive in.

Read More

Will AI Kill The Blogosphere? And Why Your Company Could Profit From That

There is a whole business segment out there of people making money writing blogs. Either they earn money directly from ads, affiliate links, subscriptions, or donations, or they grow the blog to a certain viewership and then sell it to someone who needs a platform to promote their product. There are marketplaces for this, like Flippa and Motion Invest, to name two, and YouTube channels like Income School to teach you how to do it. And although the multiples for selling such a blog are relatively low at around 2-3, if you can grow the blog fast enough and maybe run more than one at a time, it’s still a profitable business.

At the same time, there are tens of thousands of users on Stack Overflow, answering the millions of new questions per year for free. These people are not getting paid but receive a different kind of reward: recognition. Apparently, a strong enough motivator.

But with the advent of AI and its liberal use of copyrighted material, or just plain piracy, both kinds of incentives might be threatened: If on one hand AI scrapes your content so people don’t need to visit your page to create clicks or recognize your name, and on the other hand the AI might create the kind of content you provide directly without needing your input, then why put in the time and effort to create content, especially when you depend on the ability to sell it for money? Consequently, there are various signs that content creation on Stack Overflow is dropping: 1, 2, 3, and blog sales are contracting as well.

Read More

Getting Started With Stable Diffusion 3.5 On Python

So you want to follow the hype and generate some images with Stability AI’s shiny new Stable Diffusion 3.5 model (SD 3.5). You find the model on Hugging Face, and hey, there is a code example to try it out. And it works!?

Absolutely not!

Inconveniently, there are a lot more steps to take and considerations to make before your Python script will generate an AI image, especially on consumer hardware with little VRAM. This article shows how to really do it, even on a laptop GPU with only 6GB of VRAM. As such, it is an adapted collection of other material available on the web.

So, let’s get you your:

Read More

Replacing Docker Desktop With Hyper-V

Docker Desktop just got pricier again. Let’s explore some ways to replace at least part of its functionality, like running Docker containers and networking them. This guide is for the Windows operating system, as it is the one where users are most likely to use Docker Desktop.

We will use the Hyper-V virtualization solution already present on Windows and show how to integrate your Docker Desktop replacement into your environment.

Read More

Tesla Model 3: 6 Month Owners Review

I bought myself a Tesla Model 3 in August of 2023. I did it mostly because I was bored, but also to take advantage of the government subsidies that were available at the time. Hey, if the government wants to spend my taxes, they should spend them on me.
Right off the bat: The car is good. It has its flaws like most cars, but also a lot of nice features to make up for them. Some quirks also arise from the new EV technology, and neither Tesla nor the car itself can do anything to fix them.
But there are strong opinions on EVs from both supporters and opponents of the technology. Here I want to share my observations on some of the most common prejudices, and also share some learnings of my own.

Read More

Create VCards from Excel Sheets with Python

So they sent you an Excel sheet with a bunch of contacts and two days later your call history looks like you are taking part in a code decipher challenge? Maybe you should have converted them to contacts in your phone? Ok, so you tried but that didn’t work: Apple Addressbook fails silently, you cannot trust the web services with customer data and after paying for some apps on the internet you discover that some of them cannot even open Excel sheets without failing. And of course no one will type hundreds of contacts into their phone manually.

But there is another way: just automate it yourself. It’s remarkably simple using Python. In this article we demonstrate how to read an address list from an Excel sheet in xlsx format (the Excel standard since 2007) and output vCard 3.0 files one can import into their phone or email/PIM app of choice.
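To give a flavor of the output side, here is a minimal sketch of assembling a vCard 3.0 entry from one row of such a list. Reading the xlsx file itself would typically be done with a library like openpyxl; the field layout below (name, phone, email) is a hypothetical example, not necessarily the article's actual schema.

```python
def make_vcard(last: str, first: str, phone: str, email: str = "") -> str:
    """Assemble a minimal vCard 3.0 entry (RFC 2426) for one contact."""
    lines = [
        "BEGIN:VCARD",
        "VERSION:3.0",
        f"N:{last};{first};;;",    # structured name: family;given;additional;prefix;suffix
        f"FN:{first} {last}",      # formatted display name, required in vCard 3.0
        f"TEL;TYPE=CELL:{phone}",
    ]
    if email:
        lines.append(f"EMAIL;TYPE=INTERNET:{email}")
    lines.append("END:VCARD")
    return "\r\n".join(lines) + "\r\n"  # vCard lines are CRLF-terminated
```

Writing each entry to its own .vcf file, or concatenating them into one file, both work with most phone contact importers.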

Read More

please-cli: Solving man Pages With AI

Large Language Models (LLMs) like GPT-4 are trained on vast datasets that include man pages, readmes, forum questions and discussions, source code, and other sources of command line tool documentation. Given a set of requirements, one can query an LLM to predict a command line that will perform the required task.

please-cli is a wrapper around GPT-4 that can help you translate your requirements into a shell command. Let’s start with an example:

benjamin@asterix:~# please convert a.jpeg to avif and upscale it to 200%
💡 Command:
convert a.jpeg -resize 200% a.avif

❗ What should I do? [use arrow keys or initials to navigate]
> [I] Invoke [C] Copy to clipboard [Q] Ask a question [A] Abort

Well, that looks promising, and the command actually works. please-cli also gives you some handy shortcuts to immediately invoke or copy the command. You can also inquire directly about the command.
In the following sections we will look at some other examples and whether we can find limitations of the script generation.

Read More

DAVx SOGo Sync Error Multi-Get Response Without Calendar Data

For weeks I had been seeing a sync exception from my DAVx app, indicating a problem when syncing my CalDAV calendar from SOGo. As Thunderbird was not affected, I managed to ignore the problem for quite some time. But then I noticed that some newer appointments were not being synced anymore. So I had to investigate.

DAVx allows you to export debug information, where the error is shown as:

SYNCHRONIZATION INFO
Account: Account {name=benjamin@example.com, type=bitfire.at.davdroid}
Authority: com.android.calendar

EXCEPTION
at.bitfire.dav4jvm.exception.DavException: Received multi-get response without calendar data
at at.bitfire.davdroid.syncadapter.CalendarSyncManager$downloadRemote$1$1$onResponse$1.invoke(CalendarSyncManager.kt:9)
at at.bitfire.davdroid.syncadapter.CalendarSyncManager$downloadRemote$1$1$onResponse$1.invoke(CalendarSyncManager.kt:1)
at at.bitfire.davdroid.syncadapter.SyncManager.responseExceptionContext(SyncManager.kt:13)
at at.bitfire.davdroid.syncadapter.CalendarSyncManager$downloadRemote$1$1.onResponse(CalendarSyncManager.kt:18)
at at.bitfire.dav4jvm.Response$Companion.parse(Response.kt:308)
...

A few lines later the log file specifies the remote source as:

REMOTE RESOURCE
https://example.com/SOGo/dav/benjamin/Calendar/personal/event_293017175@meetup.com.ics

I tried to delete the calendar from my device and set it up again, but to no avail. After a bit of digging I got the impression that something was wrong with the calendar entry. Unfortunately, the log file doesn’t say anything about the calendar entry other than the link, so I couldn’t look at the entry in Thunderbird or the SOGo web interface.
But I am running my own SOGo instance, so I am able to check the SOGo database entry.

SOGo organizes data in a series of tables named sogo${username}${hash}. The c_name column in each table corresponds to the event name from the URL. So we can look for the event and its content in the tables:

MariaDB [sogo]> select c_content from sogobenjamin0010de46299 where c_name like '%2930%';
+-----------+
| c_content |
+-----------+
| |
+-----------+
1 row in set (0.005 sec)

Apparently the event just has an empty c_content field which is what threw off DAVx. As I don’t know anything else about the event and there is nothing to restore, I just deleted the entry from the table:

MariaDB [sogo]> delete from sogobenjamin0010de46299 where c_name like '%2930%';
Query OK, 1 row affected (0.012 sec)

After that DAVx was able to sync again without issues.

Read More