I’ve gone on and on about how Large Language Models — so-called AIs — are crippled by training on bad data. Garbage In; Garbage Out! So let’s build a local LLM and feed it good data for a specific purpose. Welcome to Virtual Grad Student! We’re going to set up a Large Language Model to run locally, feed it a clean set of data, and then make it available to authors as a virtual writer’s assistant that can, for example, pull together a few paragraphs of background on Roman aqueduct architecture.
Links:
The h2oGPT open source Large Language Model: https://gpt.h2o.ai/
Common Corpus, the largest public domain dataset for training LLMs: https://huggingface.co/collections/PleIAs/common-corpus-65d46e3ea3980fdcd66a5613
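To make the workflow concrete, here’s a minimal sketch of the idea in Python. It is an illustration under stated assumptions, not the episode’s actual setup: it uses a generic Hugging Face transformers stack rather than h2oGPT’s own tooling, and the model id, the `clean_corpus` folder, and the keyword-matching retrieval step are all placeholders.

```python
# Minimal sketch of the Virtual Grad Student idea: run an open model
# locally and ground its output in a hand-picked folder of clean texts.
# NOTE: the model id, folder name, and keyword retrieval below are
# illustrative assumptions, not the episode's actual configuration.
from pathlib import Path

from transformers import pipeline

# Any open model you have downloaded locally will do; this id is a placeholder.
generator = pipeline(
    "text-generation",
    model="mistralai/Mistral-7B-Instruct-v0.2",
    device_map="auto",
)

# "Good data": vetted public-domain texts, e.g. excerpts saved from Common Corpus.
corpus = [p.read_text() for p in Path("clean_corpus").glob("*.txt")]

# Naive retrieval: pick the document that mentions the topic most often.
# A production setup would use embeddings and a vector store instead.
topic = "aqueduct"
context = max(corpus, key=lambda doc: doc.lower().count(topic))

prompt = (
    "Using only the reference text below, write a few paragraphs of "
    "background on Roman aqueduct architecture for an author.\n\n"
    f"Reference:\n{context}\n"
)

print(generator(prompt, max_new_tokens=400)[0]["generated_text"])
```

The point of the sketch is the division of labor: the model runs entirely on your own machine, and you control exactly which texts it is allowed to lean on when drafting background for an author.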
Next on Perfecting Equilibrium
Friday March 22nd - Foto.Feola.Friday
Sunday March 24th — About that time I accidentally ended up on TV. Repeatedly. I know full well that I have a face for radio. Despite that, I’ve somehow ended up on television in everything from a Japanese miniseries to a stint as a TV reporter to a Pacific Stars & Stripes documentary.