Free Software Directory:Artificial Intelligence Team
Revision as of 14:09, 5 April 2024
The Artificial Intelligence Team gathers free software resources related to machine learning and other artificial intelligence topics.
User | Role | Real name | libera.chat nick | Time zone | Title |
---|---|---|---|---|---|
David_Hedlund | Coordinator | David Hedlund | David_Hedlund | Europe/Stockholm | |
GrahamxReed | Collaborator | Graham Reed | Graham_Reed | America/New_York | |
Mertgor | Observer | Mert Gör | hwpplayer1 | Europe/Istanbul | |
Mmcmahon | Team captain | Michael McMahon | thomzane | America/New_York | FSF Systems Administrator |
Legal
USA
- AI.gov
- NIST AI Risk Management Framework Playbook
- Copyright Registration Guidance: Works Containing Material Generated by Artificial Intelligence (March 16, 2023)
  - §3¶2: the Office will not register works produced solely by a prompt
  - §3¶3: copyright will only protect the human-authored aspects of the work, which are “independent of” and do “not affect” the copyright status of the AI-generated material itself.
EU
- AI Act (2024)
- Title I, Art. 2, 5g: The obligations laid down in this Regulation shall not apply to AI systems released under free and open source licences unless they are placed on the market or put into service as high-risk AI systems or an AI system that falls under Title II and IV.
- Title II: PROHIBITED ARTIFICIAL INTELLIGENCE PRACTICES
- Generally mentions behavior manipulation techniques, social scores, how to approach biometric data, predicting criminal offences based solely on personality, facial databases
- Title IV: TRANSPARENCY OBLIGATIONS
- Generally mentions disclosure obligations that a product is AI-generated
Text
Grammar
Translation
Text generation
Front ends
Project | Credit | License | Description |
---|---|---|---|
Agnai | agnaistic | AGPLv3 | AI agnostic (multi-user and multi-bot) chat with fictional characters |
GPT4All | Nomic AI | MIT | Run open-source LLMs anywhere |
KoboldAI | KoboldAI | AGPLv3 | A browser-based front-end for AI-assisted writing with multiple local & remote AI models |
koboldcpp | LostRuins | AGPLv3 | A simple one-file way to run various GGML and GGUF models with KoboldAI's UI |
ollama | ollama | MIT | Get up and running with large language models locally |
Serge | serge-chat | MIT | A web interface for chatting with Alpaca through llama.cpp. Fully dockerized, with an easy-to-use API |
SillyTavern | SillyTavern | AGPLv3 | LLM frontend for power users |
Text Generation Web UI | oobabooga | AGPLv3 | A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), LLaMA models |
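At bottom, each of these front ends does the same two jobs: keep a conversation history and render it into the prompt template the underlying model expects. A minimal, model-agnostic sketch of the rendering step (the `<|role|>` tag format here is illustrative, not any specific model's template):

```python
def build_prompt(system, history):
    """Render a chat history into a single prompt string.

    history is a list of (role, text) pairs, where role is "user" or
    "assistant". Real front ends substitute the exact template each
    model family was trained with.
    """
    lines = [f"<|system|>\n{system}"]
    for role, text in history:
        lines.append(f"<|{role}|>\n{text}")
    lines.append("<|assistant|>\n")  # trailing cue for the model to reply
    return "\n".join(lines)

prompt = build_prompt(
    "You are a helpful assistant.",
    [("user", "Hi!"), ("assistant", "Hello."), ("user", "Name a GNU license.")],
)
```

The projects above differ mainly in which templates, samplers, and backends they wire this loop to.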
Models
Project | Credit | License | Description |
---|---|---|---|
llama.cpp | ggerganov (Georgi Gerganov) | MIT | Port of Facebook's LLaMA model in C/C++ |
Mistral | Mistral AI | Apache 2.0 | Mistral 7B significantly outperforms Llama 2 13B on all metrics, and is on par with Llama 34B |
Mixtral | Mistral AI | Apache 2.0 | Mixtral outperforms Llama 2 70B on most benchmarks with 6x faster inference |
Honorary mention: LLaMA 1 had an AGPLv3 license.
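Whichever runtime serves them, these models emit a set of logits per step, and the next token is drawn by temperature-scaled softmax sampling. A stdlib-only sketch (the three-token vocabulary and logit values are invented for illustration):

```python
import math
import random

def sample_token(logits, temperature=0.8, rng=random):
    """Sample a key from a {token: logit} dict via temperature softmax."""
    tokens = list(logits)
    scaled = [logits[t] / max(temperature, 1e-6) for t in tokens]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    total = sum(weights)
    return rng.choices(tokens, weights=[w / total for w in weights], k=1)[0]

logits = {"cat": 2.0, "dog": 1.0, "fish": -1.0}
token = sample_token(logits, temperature=0.01)  # near-greedy: almost surely "cat"
```

Lower temperatures concentrate probability on the top logit (approaching greedy decoding); higher ones flatten the distribution.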
Retrieval Augmented Generation (RAG) UIs
Project | Credit | License | Description |
---|---|---|---|
Local RAG | Jon Fairbanks (jonfairbanks) | GPLv3 | Ingest files for retrieval augmented generation (RAG) with open-source Large Language Models (LLMs), all without 3rd parties or sensitive data leaving your network. |
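The pattern behind these UIs: embed your documents, retrieve the chunks most similar to the question, and prepend them to the prompt so the model answers from your own data. A toy sketch in which term-frequency vectors stand in for the learned embeddings and vector store a real RAG stack would use:

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy 'embedding': a term-frequency vector over lowercase words."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(x * x for x in a.values()))
    nb = math.sqrt(sum(x * x for x in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question, docs, k=1):
    """Return the k documents most similar to the question."""
    q = embed(question)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

docs = [
    "The GPL is a copyleft license.",
    "Stable Diffusion generates images from text.",
]
context = retrieve("Which license is copyleft?", docs)
prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: Which license is copyleft?"
```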
Evaluating LLMs
Project | Credit | License | Description |
---|---|---|---|
FastChat | lm-sys | Apache 2.0 | An open platform for training, serving, and evaluating large language models |
promptfoo | promptfoo | MIT | Test your prompts, models, RAGs. Evaluate and compare LLM outputs, catch regressions, and improve prompt quality |
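Both tools automate the same core loop: run a fixed suite of prompts through a model, check each output against an expectation, and report regressions. A minimal sketch, with a hardcoded stub standing in for the call into a real LLM runtime:

```python
def evaluate(model, cases):
    """Run (prompt, expected_substring) cases; return failures and pass rate."""
    failures = []
    for prompt, expected in cases:
        output = model(prompt)
        if expected.lower() not in output.lower():
            failures.append((prompt, expected, output))
    return failures, (len(cases) - len(failures)) / len(cases)

def stub_model(prompt):
    # Stand-in for a request to a local model server.
    return "Paris is the capital of France."

cases = [
    ("What is the capital of France?", "Paris"),
    ("What is the capital of Spain?", "Madrid"),
]
failures, pass_rate = evaluate(stub_model, cases)  # pass_rate == 0.5 with this stub
```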
Other
Project | Credit | License | Description |
---|---|---|---|
GPT-NeoX | EleutherAI | Apache 2.0 | EleutherAI's library for training large-scale language models on GPUs |
Open Assistant | LAION-AI | Apache 2.0 | A chat-based assistant that understands tasks, can interact with third-party systems, and can retrieve information dynamically to do so |
Open Interpreter | KillianLucas | AGPLv3 | Lets LLMs run code (Python, JavaScript, Shell, and more) locally |
Code generation
This concept is controversial. See the FSF's other writing on this topic.
Images
Image generation
Text to image GUI
Project | Credit | License | Description |
---|---|---|---|
ComfyUI | comfyanonymous | GPLv3 | Modular Stable Diffusion GUI, API, and backend with a graph/nodes interface |
ControlNet for Stable Diffusion WebUI | Mikubill (Kakigōri Maker) | GPLv3 | An AUTOMATIC1111 extension that adds ControlNet to the original Stable Diffusion model to generate images |
Stable Diffusion WebUI | AUTOMATIC1111 | AGPLv3 | Stable Diffusion web UI |
Additional libraries
Project | Credit | License | Description |
---|---|---|---|
ControlNet | lllyasviel | Apache 2.0 | Adding conditional control to text-to-image diffusion models |
Diffusers | huggingface | Apache 2.0 | State-of-the-art diffusion models for image and audio generation in PyTorch |
stable-diffusion.cpp | leejet | MIT | Stable Diffusion in pure C/C++ |
Videos
Image generation techniques create pictures by iteratively refining noise estimates. Residual noise shows up as artifacts and hampers temporal stability for objects across frames. These projects tackle that issue.
Project | Credit | License | Description |
---|---|---|---|
AnimateDiff | guoyww (Yuwei Guo) | Apache 2.0 | A plug-and-play module turning most community models into animation generators, without the need for additional training |
FILM: Frame Interpolation for Large Motion | Google Research | Apache 2.0 | A unified single-network approach to frame interpolation that doesn't use additional pre-trained networks |
MagicAnimate | MagIC Research | BSD-3-Clause | Temporally consistent human image animation using a diffusion model |
MotionCtrl | ARC Lab, Tencent PCG | Apache 2.0 | A unified and flexible motion controller for video generation |
TemporalKit | CiaraStrawberry (Ciara Rowles) | GPLv3 | An all-in-one solution for adding temporal stability to a Stable Diffusion render via an AUTOMATIC1111 extension |
Text To Video Synthesis Colab | camenduru | The Unlicense | A text-to-video synthesis model that evolves from a text-to-image synthesis model |
Thin-Plate Spline Motion Model for Image Animation | yoyo-nb | MIT | Animates a static object in a source image according to a driving video |
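The baseline these projects improve on is easy to state: interpolating between two frames with a per-pixel linear blend. Naive blending ghosts on anything that moves, which is why FILM and the others estimate motion instead. A sketch with frames as nested lists of grayscale values:

```python
def blend_frames(frame_a, frame_b, t=0.5):
    """Linearly interpolate two equally sized grayscale frames.

    t=0 returns frame_a, t=1 returns frame_b. Motion-aware
    interpolators warp pixels along estimated motion paths instead,
    avoiding the ghosting this in-place blend produces.
    """
    return [
        [(1 - t) * a + t * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(frame_a, frame_b)
    ]

a = [[0, 0], [100, 100]]
b = [[100, 100], [0, 0]]
mid = blend_frames(a, b)  # [[50.0, 50.0], [50.0, 50.0]]
```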
Image recognition
Project | Credit | License | Description |
---|---|---|---|
EfficientViT | MIT Han Lab | Apache 2.0 | A new family of vision models for efficient high-resolution vision |
LLaVA | Haotian Liu | Apache 2.0 | Visual instruction tuning towards large language and vision models with GPT-4 level capabilities |
MiniGPT-4 | Vision-CAIR | BSD 3-Clause "New" or "Revised" License & BSD 3-Clause License | Open-sourced code for large language models as a unified interface for vision-language multi-task learning |
3D modeling
Project | Credit | License | Description |
---|---|---|---|
threestudio | threestudio | Apache 2.0 | A unified framework for 3D content generation |
TripoSR | VAST-AI-Research | MIT | Fast feedforward 3D reconstruction from a single image. The model is MIT-licensed as well |
Audio
Natural language processing (NLP)
Transcription (speech to text (STT))
Project | Credit | License | Description |
---|---|---|---|
whisper.cpp | ggerganov (Georgi Gerganov) | MIT | Port of OpenAI's Whisper model in C/C++ |
Synthesis (text to speech (TTS))
Project | Credit | License | Description |
---|---|---|---|
Bark | Suno AI | MIT | Text-Prompted Generative Audio Model |
Coqui TTS | Coqui AI | MPL 2.0 | A deep learning toolkit for Text-to-Speech, battle-tested in research and production |
TorToiSe | neonbjb (James Betker) | Apache 2.0 | A multi-voice TTS system trained with an emphasis on quality |
WhisperSpeech | Collabora | MIT | Created using only properly licensed speech recordings, so the model and code will always be safe to use for commercial applications |
Transmogrify (speech to speech (STS))
Project | Credit | License | Description |
---|---|---|---|
Retrieval-based-Voice-Conversion-WebUI | RVC-Project | MIT | As little as 10 minutes of voice data can be used to train a good voice conversion (VC) model |
SoftVC VITS Singing Voice Conversion Fork | voicepaw | Apache 2.0 & MIT | so-vits-svc fork with realtime support, improved interface and more features |
Music
Audio Diffusion
Project | Credit | License | Description |
---|---|---|---|
Audio Diffusion | Harmonai | MIT | A Stability AI lab focused on open-source generative audio models |
Music splitters
Project | Credit | License | Description |
---|---|---|---|
Demucs (v3) | Meta Research | MIT | Code for the paper Hybrid Spectrogram and Waveform Source Separation |
Moseca | fabiogra (Fabio Grasso) | MIT | A Streamlit web app for music source separation & karaoke |
spleeter | deezer | MIT | Deezer source separation library including pretrained models |
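For contrast with the learned separators above, the classic pre-ML karaoke trick is center-channel cancellation: lead vocals are usually mixed identically into both stereo channels, so subtracting one channel from the other cancels them, along with anything else panned dead center. A toy sketch on integer sample lists:

```python
def remove_center(left, right):
    """Cancel center-panned content from a stereo pair of sample lists."""
    return [l - r for l, r in zip(left, right)]

vocals = [5, -3, 2]   # identical in both channels (center-panned)
guitar = [1, 1, 1]    # panned hard left
left = [v + g for v, g in zip(vocals, guitar)]
right = list(vocals)  # no guitar on the right
karaoke = remove_center(left, right)  # [1, 1, 1]: vocals cancel, guitar survives
```

Learned models like Demucs and spleeter separate sources that overlap in both channels, which this trick cannot.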
Uncategorized
- Virtual assistant: Mycroft
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the page “GNU Free Documentation License”.
The copyright and license notices on this page only apply to the text on this page. Any software or copyright-licenses or other similar notices described in this text has its own copyright notice and license, which can usually be found in the distribution or license text itself.