Difference between revisions of "Free Software Directory:Artificial Intelligence Team"

From Free Software Directory

Revision as of 23:23, 4 March 2024

[Image: Free Software Foundation - Free Software Directory - Artificial Intelligence Project Team logo]

The Artificial Intelligence Project Team gathers free software resources regarding machine learning / artificial intelligence.

User | Role | Real name | libera.chat nick | Time zone | Title
David_Hedlund | Coordinator | David Hedlund | David_Hedlund | Europe/Stockholm |
GrahamxReed | Collaborator | Graham Reed | Graham_Reed | America/New_York |
Mertgor | Observer | Mert Gör | hwpplayer1 | Europe/Istanbul |
Mmcmahon | Team captain | Michael McMahon | thomzane | America/New_York | FSF Systems Administrator

Truthfulness

Transformers

Text

Grammar

Translation

Text generation

Front ends

Project | Credit | License | Description
Agnai | agnaistic | AGPLv3 | AI-agnostic (multi-user and multi-bot) chat with fictional characters
ExLlamaV2 | turboderp | MIT | A fast inference library for running LLMs locally on modern consumer-class GPUs
ExLlamaV2 WebUI | turboderp | MIT |
GPT4All | Nomic AI | MIT | Run open-source LLMs anywhere
KoboldAI | KoboldAI | AGPLv3 | A browser-based front end for AI-assisted writing with multiple local and remote AI models
koboldcpp | LostRuins | AGPLv3 | A simple one-file way to run various GGML and GGUF models with KoboldAI's UI
Serge | serge-chat | MIT | A web interface for chatting with Alpaca through llama.cpp; fully dockerized, with an easy-to-use API
SillyTavern | SillyTavern | AGPLv3 | LLM front end for power users
Text Generation Web UI | oobabooga | AGPLv3 | A Gradio web UI for large language models; supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), and LLaMA models
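Several of the front ends above (for example, koboldcpp and Text Generation Web UI) can expose an OpenAI-compatible HTTP API for the locally hosted model. The sketch below builds and sends such a chat request using only the Python standard library; the host, port, endpoint path, and model name are assumptions that vary by front end and configuration.

```python
import json
import urllib.request

def build_chat_payload(prompt, model="local-model", max_tokens=128):
    """Build an OpenAI-style chat-completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def ask(prompt, base_url="http://127.0.0.1:5001/v1"):
    """POST the prompt to a local OpenAI-compatible server and return the reply text.

    The base_url above is a placeholder; point it at whatever address your
    front end actually serves its OpenAI-compatible API on.
    """
    req = urllib.request.Request(
        base_url + "/chat/completions",
        data=json.dumps(build_chat_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

if __name__ == "__main__":
    # No server needed just to inspect the request body being sent.
    print(json.dumps(build_chat_payload("Name one free software license."), indent=2))
```

Because the request shape follows the OpenAI chat-completions convention, the same client code can usually be pointed at any of these local servers by changing only `base_url`.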

Models

Project | Credit | License | Description
llama.cpp | ggerganov (Georgi Gerganov) | MIT | Port of Facebook's LLaMA model in C/C++
Mistral | Mistral AI | Apache 2.0 | Mistral 7B significantly outperforms Llama 2 13B on all metrics and is on par with Llama 34B
Mixtral | Mistral AI | Apache 2.0 | Mixtral outperforms Llama 2 70B on most benchmarks with 6x faster inference

Other

Project | Credit | License | Description
GPT-NeoX | EleutherAI | Apache 2.0 | EleutherAI's library for training large-scale language models on GPUs
FastChat | lm-sys | Apache 2.0 | An open platform for training, serving, and evaluating large language models
Open Assistant | LAION-AI | Apache 2.0 | A chat-based assistant that understands tasks, can interact with third-party systems, and can retrieve information dynamically to do so
Open Interpreter | KillianLucas | AGPLv3 | Lets LLMs run code (Python, JavaScript, Shell, and more) locally

Honorary mention: LLaMA 1 had an AGPLv3 license.

Code generation

This concept is controversial. See the FSF's other writing on this topic.

Images

Image generation

Text to image GUI

Project | Credit | License | Description
ComfyUI | comfyanonymous | GPLv3 | Modular Stable Diffusion GUI, API, and backend with a graph/nodes interface
ControlNet for Stable Diffusion WebUI | Mikubill (Kakigōri Maker) | GPLv3 | An AUTOMATIC1111 extension that adds ControlNet to the original Stable Diffusion model to generate images
Stable Diffusion WebUI | AUTOMATIC1111 | AGPLv3 | Stable Diffusion web UI

Additional libraries

Project | Credit | License | Description
ControlNet | lllyasviel | Apache 2.0 | Adding conditional control to text-to-image diffusion models
Diffusers | huggingface | Apache 2.0 | State-of-the-art diffusion models for image and audio generation in PyTorch
stable-diffusion.cpp | leejet | MIT | Stable Diffusion in pure C/C++

Legacy

Project | Credit | License | Description
DALL-E Mini | borisdayma (Boris Dayma) | Apache 2.0 | Generate images from a text prompt
neural-style | anishathalye | GPLv3 | An implementation of neural style in TensorFlow

Videos

Diffusion-based image generators create pictures by progressively removing estimated noise. Residual noise shows up as frame-to-frame artifacts and hampers temporal stability for objects in generated video. These projects tackle that issue.
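One of the techniques listed below, frame interpolation, synthesizes an in-between frame from two neighboring frames. The naive version is a pixel-wise linear blend; the pure-Python toy below (frames as nested lists of grayscale intensities) only illustrates that basic idea, not how a learned interpolator like FILM works internally. Learned models exist precisely because simple blending fails under large motion and occlusion.

```python
# Naive frame interpolation by linear blending, in pure Python.
# This is an illustrative toy, not the method used by FILM or similar models.

def blend_frames(frame_a, frame_b, t=0.5):
    """Return the pixel-wise blend (1 - t) * a + t * b of two equal-size
    grayscale frames, each given as a nested list of pixel intensities."""
    return [
        [(1 - t) * a + t * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(frame_a, frame_b)
    ]

frame0 = [[0, 0], [100, 100]]
frame1 = [[100, 100], [0, 0]]
print(blend_frames(frame0, frame1))  # halfway frame: every pixel at 50.0
```

For moving objects, a blend like this produces ghosting (both positions faintly visible), which is why the projects below estimate motion instead of averaging pixels.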

Project | Credit | License | Description
FILM: Frame Interpolation for Large Motion | Google Research | Apache 2.0 | A unified single-network approach to frame interpolation that doesn't use additional pre-trained networks
MagicAnimate | MagIC Research | BSD-3-Clause | Temporally consistent human image animation using a diffusion model
MotionCtrl | ARC Lab, Tencent PCG | Apache 2.0 | A unified and flexible motion controller for video generation
TemporalKit | CiaraStrawberry (Ciara Rowles) | GPLv3 | An all-in-one solution for adding temporal stability to a Stable Diffusion render via an AUTOMATIC1111 extension
Text To Video Synthesis Colab | camenduru | The Unlicense | A text-to-video synthesis model that evolves from a text-to-image synthesis model
Thin-Plate Spline Motion Model for Image Animation | yoyo-nb | MIT | Animates a static object in a source image according to a driving video

Image captioning

3D modeling

Audio

Natural language processing (NLP)

Transcription (Speech to text (STT))

  • whisper.cpp: OpenAI's Whisper model ported to C/C++, by ggerganov (Georgi Gerganov). MIT

Synthesis (text to speech (TTS))

Project | Credit | License | Description
Bark | Suno AI | MIT | Text-prompted generative audio model
Coqui TTS | Coqui AI | MPL 2.0 | A deep learning toolkit for text-to-speech, battle-tested in research and production
TorToiSe | neonbjb (James Betker) | Apache 2.0 | A multi-voice TTS system trained with an emphasis on quality

Transmogrify (speech to speech (STS))

Project | Credit | License | Description
Retrieval-based-Voice-Conversion-WebUI | RVC-Project | MIT | A good voice-conversion model can be trained from as little as 10 minutes of voice data
SoftVC VITS Singing Voice Conversion Fork | voicepaw | Apache 2.0 & MIT | A so-vits-svc fork with realtime support, an improved interface, and more features

Music

  • Moseca separates music tracks into stems (voice, drums, bass, guitar, piano, and others), which is useful for remixing, karaoke, and music studies. MIT

Uncategorized



Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the page “GNU Free Documentation License”.

The copyright and license notices on this page apply only to the text on this page. Any software, copyright licenses, or other similar notices described in this text have their own copyright notices and licenses, which can usually be found in the distribution or license text itself.