Difference between revisions of "Free Software Directory:Artificial Intelligence Team"

 
[[File:Free Software Foundation-Free Software Directory-Artificial Intelligence Project Team.png|200px]]
 
<onlyinclude>The Artificial Intelligence Team gathers free software resources on machine learning and other artificial intelligence-related topics.</onlyinclude>
  
 
{| class="wikitable sortable" border="1"
! User !! Role !! Reference !! Real name !! libera.chat nick !! Time zone !! Title
|-
| David_Hedlund || Coordinator || || David Hedlund || David_Hedlund || Europe/Stockholm ||
|-
| GrahamxReed || Collaborator || || Graham Reed || Graham_Reed || America/New_York ||
|-
| Mertgor || Observer || || Mert Gör || hwpplayer1 || Europe/Istanbul ||
|-
| Mmcmahon || Team captain || || Michael McMahon || thomzane || America/New_York || FSF Systems Administrator
|}
  
== Legal ==

=== USA ===
* [https://ai.gov/ AI.gov]
* [https://airc.nist.gov/AI_RMF_Knowledge_Base/Playbook NIST AI Risk Management Framework Playbook]
* [https://www.copyright.gov/ai/ai_policy_guidance.pdf Copyright Registration Guidance: Works Containing Material Generated by Artificial Intelligence (March 16, 2023)]
** §3¶2: the Office will not register works produced solely by a prompt
** §3¶3: copyright will only protect the human-authored aspects of the work, which are "independent of" and do "not affect" the copyright status of the AI-generated material itself

=== EU ===
* [https://artificialintelligenceact.eu/the-act/ AI Act (2024)]
** Title I, Art. 2, 5g: The obligations laid down in this Regulation shall not apply to AI systems released under [https://www.gnu.org/philosophy/floss-and-foss.html free and open source] licences unless they are placed on the market or put into service as high-risk AI systems or an AI system that falls under Title II and IV.
*** Title II: PROHIBITED ARTIFICIAL INTELLIGENCE PRACTICES
**** Covers behaviour-manipulation techniques, social scoring, handling of biometric data, predicting criminal offences based solely on personality, and facial databases
*** Title IV: TRANSPARENCY OBLIGATIONS
**** Covers obligations to disclose that a product is AI-generated
 
== Text ==
  
 
=== Text generation ===

====Front ends====
 
{| class="wikitable sortable"
|-
! Project
! Credit
! License
! Description
|-
! [https://github.com/agnaistic/agnai Agnai]
| agnaistic
| [https://github.com/agnaistic/agnai#AGPL-3.0-1-ov-file AGPLv3]
| AI-agnostic (multi-user and multi-bot) chat with fictional characters
 
|-
! [https://github.com/nomic-ai/gpt4all GPT4All]
| Nomic AI
| [https://github.com/nomic-ai/gpt4all/blob/main/LICENSE.txt MIT]
| Run open-source LLMs anywhere
 
|-
 
! [https://github.com/KoboldAI/KoboldAI-Client KoboldAI]
| KoboldAI
| [https://github.com/KoboldAI/KoboldAI-Client/blob/main/LICENSE.md AGPLv3]
| A browser-based front-end for AI-assisted writing with multiple local & remote AI models
|-
! [https://github.com/LostRuins/koboldcpp koboldcpp]
| LostRuins
| [https://github.com/LostRuins/koboldcpp/blob/concedo/LICENSE.md AGPLv3]
| A simple one-file way to run various GGML and GGUF models with KoboldAI's UI
 
|-
! [https://github.com/ollama/ollama ollama]
| ollama
| [https://github.com/ollama/ollama?tab=MIT-1-ov-file#readme MIT]
| Get up and running with large language models locally
 
|-
! [https://github.com/serge-chat/serge Serge]
| serge-chat
| [https://github.com/serge-chat/serge/blob/main/LICENSE MIT]
| A web interface for chatting with Alpaca through llama.cpp. Fully dockerized, with an easy-to-use API
 
|-
! [https://github.com/SillyTavern/SillyTavern SillyTavern]
| SillyTavern
| [https://github.com/SillyTavern/SillyTavern/blob/release/LICENSE AGPLv3]
| LLM frontend for power users
 
|-
 
! [https://github.com/oobabooga/text-generation-webui Text Generation Web UI]
| oobabooga
| [https://github.com/oobabooga/text-generation-webui/blob/main/LICENSE AGPLv3]
| A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), and LLaMA models
|-
|}
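Several of the front ends above, ollama and Text Generation Web UI among them, can expose an OpenAI-compatible HTTP chat API. The sketch below only builds the JSON request body for such an endpoint; the model name is a placeholder assumption, and the exact URL and options depend on the front end's own documentation.

```python
import json

def chat_request(model, user_message):
    """Build an OpenAI-style chat-completion request body as a JSON string."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return json.dumps(payload)

# "local-model" is a placeholder; each front end lists its own model names.
body = chat_request("local-model", "Say hello in one word.")
print(body)
```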
====Models====

{| class="wikitable sortable"
|-
! Project
! Credit
! License
! Description
|-
! [https://github.com/ggerganov/llama.cpp llama.cpp]
| ggerganov (Georgi Gerganov)
| [https://github.com/ggerganov/llama.cpp/blob/master/LICENSE MIT]
| Port of Facebook's LLaMA model in C/C++
|-
! [https://mistral.ai/news/announcing-mistral-7b/ Mistral 7B]
| Mistral AI
| [https://docs.mistral.ai/getting-started/open_weight_models/ Apache 2.0]
| The first dense model released by Mistral AI, well suited to experimentation, customization, and quick iteration. At release, it matched the capabilities of models up to 30B parameters
|-
! [https://mistral.ai/news/mixtral-of-experts/ Mixtral 8x7B]
| Mistral AI
| [https://docs.mistral.ai/getting-started/open_weight_models/ Apache 2.0]
| A sparse mixture-of-experts model. It holds up to 45B parameters but uses only about 12B during inference, giving better inference throughput at the cost of more VRAM
|-
! [https://mistral.ai/news/mixtral-8x22b/ Mixtral 8x22B]
| Mistral AI
| [https://docs.mistral.ai/getting-started/open_weight_models/ Apache 2.0]
| A larger sparse mixture-of-experts model with a larger context window. It holds up to 176B parameters but uses only about 39B during inference, giving better inference throughput at the cost of more VRAM
 
|-
|}
Honorary mention: LLaMA 1 had an [http://web.archive.org/web/20230224201551/https://github.com/facebookresearch/llama/blob/main/LICENSE AGPLv3] license.
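The Mixtral entries above contrast total parameters with the parameters actually used per token. A toy calculation shows where such numbers come from in a sparse mixture of experts; the per-component sizes below are illustrative assumptions, not Mixtral's real breakdown.

```python
def moe_param_counts(shared, per_expert, n_experts, active_experts):
    """Total vs. per-token-active parameter counts for a sparse MoE."""
    total = shared + n_experts * per_expert        # all experts are stored
    active = shared + active_experts * per_expert  # only routed experts run
    return total, active

# Assumed sizes in billions: shared attention/embedding weights plus
# 8 expert feed-forward blocks, of which the router picks 2 per token.
total, active = moe_param_counts(shared=2.0, per_expert=5.4,
                                 n_experts=8, active_experts=2)
print(f"total = {total:.1f}B, active per token = {active:.1f}B")
```

With these made-up sizes the result lands near the "45B stored, about 12B active" figures quoted for Mixtral 8x7B; every expert must still sit in VRAM, which is why throughput improves while memory use does not.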
==== Retrieval Augmented Generation (RAG) UIs ====

{| class="wikitable sortable"
|-
! Project
! Credit
! License
! Description
|-
! [https://github.com/jonfairbanks/local-rag Local RAG]
| Jon Fairbanks (jonfairbanks)
| [https://github.com/jonfairbanks/local-rag?tab=GPL-3.0-1-ov-file GPLv3]
| Ingest files for retrieval-augmented generation (RAG) with open-source Large Language Models (LLMs), without third parties or sensitive data leaving your network
|-
|}
====Evaluating LLMs====

{| class="wikitable sortable"
|-
! Project
! Credit
! License
! Description
 
|-
 
! [https://github.com/lm-sys/FastChat FastChat]
| lm-sys
| [https://github.com/lm-sys/FastChat/blob/main/LICENSE Apache-2.0]
| An open platform for training, serving, and evaluating large language models
 
|-
! [https://github.com/promptfoo/promptfoo promptfoo]
| promptfoo
| [https://github.com/promptfoo/promptfoo?tab=MIT-1-ov-file#readme MIT]
| Test your prompts, models, and RAGs. Evaluate and compare LLM outputs, catch regressions, and improve prompt quality
|}
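Evaluation tools like promptfoo reduce, at their core, to running each prompt through a model and checking the output against an expectation. A miniature harness follows; fake_model is a deterministic stand-in assumption, not a real backend.

```python
def evaluate(model, cases):
    """Run (prompt, expected-substring) cases; return pass rate and detail."""
    results = [(prompt, expect, expect in model(prompt))
               for prompt, expect in cases]
    passed = sum(ok for _, _, ok in results)
    return passed / len(results), results

def fake_model(prompt):
    # Deterministic stand-in for an LLM backend.
    return "Paris is the capital of France."

rate, detail = evaluate(fake_model, [
    ("Capital of France?", "Paris"),
    ("Capital of Spain?", "Madrid"),
])
print(f"pass rate: {rate:.0%}")
```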
====Other====

{| class="wikitable sortable"
|-
! Project
! Credit
! License
! Description
 
|-
! [https://github.com/EleutherAI/gpt-neox GPT-NeoX]
| EleutherAI
| [https://github.com/EleutherAI/gpt-neox/blob/main/LICENSE Apache 2.0]
| EleutherAI's library for training large-scale language models on GPUs
 
|-
! [https://github.com/LAION-AI/Open-Assistant Open Assistant]
| LAION-AI
| [https://github.com/LAION-AI/Open-Assistant/blob/main/LICENSE Apache 2.0]
| A chat-based assistant that understands tasks, can interact with third-party systems, and can retrieve information dynamically
 
|-
! [https://github.com/KillianLucas/open-interpreter Open Interpreter]
| KillianLucas
| [https://github.com/KillianLucas/open-interpreter/blob/main/LICENSE AGPLv3]
| Lets LLMs run code (Python, JavaScript, Shell, and more) locally
 
|}
  
=== Code generation ===

This concept is controversial. See the FSF's other writing on this topic.
 
 
== Images ==

=== Image generation ===

==== Text to image GUI ====
 
{| class="wikitable sortable"
|-
! Project
! Credit
! License
! Description
 
|-
 
! [https://github.com/comfyanonymous/ComfyUI ComfyUI]
| comfyanonymous
| [https://github.com/comfyanonymous/ComfyUI/blob/master/LICENSE GPLv3]
| Modular Stable Diffusion GUI, API, and backend with a graph/nodes interface
|-
! [https://github.com/Mikubill/sd-webui-controlnet ControlNet for Stable Diffusion WebUI]
| Mikubill (Kakigōri Maker)
| [https://github.com/Mikubill/sd-webui-controlnet#GPL-3.0-1-ov-file GPLv3]
| An AUTOMATIC1111 extension that adds ControlNet to the original Stable Diffusion model to generate images
|-
! [https://github.com/AUTOMATIC1111/stable-diffusion-webui Stable Diffusion WebUI]
| AUTOMATIC1111
| [https://github.com/AUTOMATIC1111/stable-diffusion-webui/blob/master/LICENSE.txt AGPLv3]
| Stable Diffusion web UI
|-
|}
==== Additional libraries ====

{| class="wikitable sortable"
|-
! Project
! Credit
! License
! Description
 
|-
 
! [https://github.com/lllyasviel/ControlNet ControlNet]
| lllyasviel
| [https://github.com/lllyasviel/ControlNet/blob/main/LICENSE Apache-2.0]
| Adding conditional control to text-to-image diffusion models
 
|-
 
! [https://github.com/huggingface/diffusers Diffusers]
| huggingface
| [https://github.com/huggingface/diffusers/blob/main/LICENSE Apache-2.0]
| State-of-the-art diffusion models for image and audio generation in PyTorch
|-
! [https://github.com/leejet/stable-diffusion.cpp stable-diffusion.cpp]
| leejet
| [https://github.com/leejet/stable-diffusion.cpp?tab=MIT-1-ov-file#readme MIT]
| Stable Diffusion in pure C/C++
|-
|}
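Diffusion models, the family behind Diffusers and stable-diffusion.cpp, generate images by starting from noise and repeatedly subtracting the noise a network predicts. Below is a one-dimensional toy of that loop; it cheats by computing the noise exactly, where a real model would have to predict it.

```python
import random

random.seed(0)
target = [0.2, 0.8, 0.5]                  # the "image" the toy pretends to know
x = [random.gauss(0, 1) for _ in target]  # start from pure noise

for _ in range(10):
    # A real model *predicts* this noise; the toy computes it exactly.
    noise = [xi - ti for xi, ti in zip(x, target)]
    x = [xi - 0.5 * ni for xi, ni in zip(x, noise)]  # remove half the noise

print([round(v, 3) for v in x])  # close to target after 10 steps
```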
===Videos===

Image generation techniques create pictures from noise estimates. That noise shows up as artifacts and hampers the temporal stability of objects across frames; these projects tackle that issue.

{| class="wikitable sortable"
|-
! Project
! Credit
! License
! Description
 
|-
! [https://github.com/guoyww/AnimateDiff AnimateDiff]
| guoyww (Yuwei Guo)
| [https://github.com/guoyww/AnimateDiff?tab=Apache-2.0-1-ov-file#readme Apache 2.0]
| A plug-and-play module that turns most community models into animation generators without additional training
 
|-
! [https://github.com/google-research/frame-interpolation FILM: Frame Interpolation for Large Motion]
| Google Research
| [https://github.com/google-research/frame-interpolation#Apache-2.0-1-ov-file Apache 2.0]
| A unified single-network approach to frame interpolation that doesn't use additional pre-trained networks
 
|-
! [https://github.com/magic-research/magic-animate MagicAnimate]
| MagIC Research
| [https://github.com/magic-research/magic-animate#BSD-3-Clause-1-ov-file BSD-3-Clause]
| Temporally consistent human image animation using a diffusion model
|-
! [https://github.com/TencentARC/MotionCtrl MotionCtrl]
| ARC Lab, Tencent PCG
| [https://github.com/TencentARC/MotionCtrl#Apache-2.0-1-ov-file Apache 2.0]
| A unified and flexible motion controller for video generation
|-
! [https://github.com/CiaraStrawberry/TemporalKit TemporalKit]
| CiaraStrawberry (Ciara Rowles)
| [https://github.com/CiaraStrawberry/TemporalKit?tab=GPL-3.0-1-ov-file#readme GPLv3]
| An all-in-one solution for adding temporal stability to a Stable Diffusion render via an AUTOMATIC1111 extension
 
|-
 
! [https://github.com/camenduru/text-to-video-synthesis-colab Text To Video Synthesis Colab]
| camenduru
| [https://github.com/camenduru/text-to-video-synthesis-colab/blob/main/LICENSE The Unlicense]
| A text-to-video synthesis model that evolves from a text-to-image synthesis model
 
|-
 
! [https://github.com/yoyo-nb/Thin-Plate-Spline-Motion-Model Thin-Plate Spline Motion Model for Image Animation]
| yoyo-nb
| [https://github.com/yoyo-nb/Thin-Plate-Spline-Motion-Model/blob/main/LICENSE MIT]
| Animates a static object in a source image according to a driving video
|-
 
|}
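Frame-interpolation entries like FILM predict in-between frames; plain linear blending is the naive baseline such models improve on. A sketch with frames as flat lists of pixel intensities:

```python
def interpolate(frame_a, frame_b, t=0.5):
    """Blend two equally sized frames at time t in [0, 1]."""
    return [(1 - t) * a + t * b for a, b in zip(frame_a, frame_b)]

# Two 3-pixel "frames"; the midpoint frame averages them.
mid = interpolate([0, 100, 200], [100, 100, 0], t=0.5)
print(mid)  # [50.0, 100.0, 100.0]
```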
  
=== Image recognition ===

{| class="wikitable sortable"
|-
! Project
! Credit
! License
! Description
|-
! [https://github.com/mit-han-lab/efficientvit EfficientViT]
| MIT Han Lab
| [https://github.com/mit-han-lab/efficientvit?tab=Apache-2.0-1-ov-file#readme Apache 2.0]
| A new family of vision models for efficient high-resolution vision
|-
! [https://github.com/haotian-liu/LLaVA LLaVA]
| Haotian Liu
| [https://github.com/haotian-liu/LLaVA/blob/main/LICENSE Apache 2.0]
| Visual instruction tuning towards large language and vision models with GPT-4-level capabilities
|-
! [https://github.com/Vision-CAIR/MiniGPT-4 MiniGPT-4]
| Vision-CAIR
| [https://github.com/Vision-CAIR/MiniGPT-4/blob/main/LICENSE.md BSD 3-Clause "New" or "Revised" License] & [https://github.com/Vision-CAIR/MiniGPT-4/blob/main/LICENSE_Lavis.md BSD 3-Clause License]
| Open-sourced code for large language models as a unified interface for vision-language multi-task learning
|}
  
 
=== 3D modeling ===

{| class="wikitable sortable"
|-
! Project
! Credit
! License
! Description
|-
! [https://github.com/threestudio-project/threestudio threestudio]
| threestudio
| [https://github.com/threestudio-project/threestudio#Apache-2.0-1-ov-file Apache 2.0]
| A unified framework for 3D content generation
|-
! [https://github.com/VAST-AI-Research/TripoSR TripoSR]
| VAST-AI-Research
| [https://github.com/VAST-AI-Research/TripoSR#MIT-1-ov-file MIT]
| Fast feedforward 3D reconstruction from a single image. The model is MIT-licensed too
|}
  
 
== Audio ==
 
=== Natural language processing (NLP) ===

==== Transcription (Speech to text (STT)) ====

* [[Vosk]]
* [[Whisper]]

{| class="wikitable sortable"
|-
! Project
! Credit
! License
! Description
|-
! [https://github.com/ggerganov/whisper.cpp whisper.cpp]
| ggerganov (Georgi Gerganov)
| [https://github.com/ggerganov/whisper.cpp?tab=MIT-1-ov-file#readme MIT]
| Port of OpenAI's Whisper model in C/C++
|}

==== Synthesis (text to speech (TTS)) ====

{| class="wikitable sortable"
|-
! Project
! Credit
! License
! Description
 
|-
 
! [https://github.com/suno-ai/bark Bark]
| Suno AI
| [https://github.com/suno-ai/bark/blob/main/LICENSE MIT]
| Text-Prompted Generative Audio Model
 
|-
 
! [https://github.com/coqui-ai/TTS Coqui TTS]
| Coqui AI
| [https://github.com/coqui-ai/TTS/blob/dev/LICENSE.txt MPL 2.0]
| A deep learning toolkit for Text-to-Speech, battle-tested in research and production
|-
! [https://github.com/neonbjb/tortoise-tts TorToiSe]
| neonbjb (James Betker)
| [https://github.com/neonbjb/tortoise-tts/blob/main/LICENSE Apache 2.0]
| A multi-voice TTS system trained with an emphasis on quality
|-
! [https://github.com/collabora/WhisperSpeech?tab=readme-ov-file WhisperSpeech]
| Collabora
| [https://github.com/collabora/WhisperSpeech?tab=MIT-1-ov-file#readme MIT]
| Built only from properly licensed speech recordings, so the model and code are always safe to use for commercial applications
|}
==== Transmogrify (speech to speech (STS)) ====

{| class="wikitable sortable"
|-
! Project
! Credit
! License
! Description
|-
! [https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI Retrieval-based-Voice-Conversion-WebUI]
| RVC-Project
| [https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI#MIT-1-ov-file MIT]
| As little as 10 minutes of voice data can be used to train a good voice conversion (VC) model
 
|-
 
! [https://github.com/voicepaw/so-vits-svc-fork SoftVC VITS Singing Voice Conversion Fork]
| voicepaw
| [https://github.com/voicepaw/so-vits-svc-fork/blob/main/LICENSE Apache 2.0 & MIT]
| so-vits-svc fork with realtime support, an improved interface, and more features
 
|-
|}
  
 
=== Music ===
====Audio Diffusion====

{| class="wikitable sortable"
|-
! Project
! Credit
! License
! Description
|-
! [https://github.com/Harmonai-org Audio Diffusion]
| Harmonai
| MIT, MIT, MIT
| A Stability AI lab focused on open-source generative audio models
|}
  
==== Music splitters ====

{| class="wikitable sortable"
|-
! Project
! Credit
! License
! Description
|-
! [https://github.com/facebookresearch/demucs/tree/hybrid Demucs (v3)]
| Meta Research
| [https://github.com/facebookresearch/demucs/blob/main/LICENSE MIT]
| Code for the paper Hybrid Spectrogram and Waveform Source Separation
|-
! [https://github.com/fabiogra/moseca Moseca]
| fabiogra (Fabio Grasso)
| [https://github.com/fabiogra/moseca?tab=MIT-1-ov-file#readme MIT]
| A Streamlit web app that separates music into stems (voice, drums, bass, guitar, piano, and others), useful for remixing, karaoke, and music studies
|-
! [https://github.com/deezer/spleeter spleeter]
| deezer
| [https://github.com/deezer/spleeter?tab=MIT-1-ov-file#readme MIT]
| Deezer source separation library including pretrained models
|}
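Splitters in the vein of Spleeter and Demucs commonly estimate, for each time-frequency bin, how much of the mixture belongs to each source, then apply that ratio as a soft mask. A single-frame toy; the two-bin magnitude estimates are made-up numbers standing in for a spectrogram column.

```python
def soft_masks(source_estimates):
    """Ratio masks per source from per-source magnitude estimates."""
    totals = [sum(bins) for bins in zip(*source_estimates)]
    return [[b / t if t else 0.0 for b, t in zip(bins, totals)]
            for bins in source_estimates]

vocals = [3.0, 1.0]  # assumed magnitudes in two frequency bins
drums = [1.0, 3.0]
masks = soft_masks([vocals, drums])
print(masks)  # [[0.75, 0.25], [0.25, 0.75]]
```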
 
==Uncategorized==
 
* Virtual assistant: [[Mycroft]]

Latest revision as of 10:36, 18 April 2024

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the page “GNU Free Documentation License”.

The copyright and license notices on this page apply only to the text on this page. Any software, copyright licenses, or other similar notices described in this text have their own copyright notices and licenses, which can usually be found in the distribution or license text itself.