Difference between revisions of "Free Software Directory talk:Artificial Intelligence Team"

From Free Software Directory
(adding more descriptive summary for copyright guidance)
m (Potential Freedom issues: formatting for easier read)

Revision as of 23:00, 14 May 2023

Free software replacements that are missing

  • AI Research Assistant
    • https://elicit.org/ - Elicit uses language models to help you automate research workflows, like parts of literature review.
  • Voice to instrument: Tone Transfer-like
  • Identification
    • Photo
    • Audio
      • Shazam: Shazam is an application that can identify music, movies, advertising, and television shows, based on a short sample played and using the microphone on the device.
      • A free app that functions like midomi.com -- "You can find songs with midomi and your own voice. Forgot the name of a song? Heard a bit of one on the radio? All you need is your computer's microphone."
  • http://design.rxnfinder.org/addictedchem/prediction/

Potential Freedom issues

  • Dependencies need to be checked.
  • Verify whether a workflow requires non-free GPU or if CPU can be used.
  • The training data often contains non-free licensed material.
    • According to current copyright laws, this does not affect the license of the model or of its output, and the output itself is in the public domain. Mmcmahon (talk) 11:48, 2 May 2023 (EDT)

US Copyright Office AI policy guidance (Mar 16 '23): https://www.copyright.gov/ai/ai_policy_guidance.pdf

Purely generated AI content is not copyrightable

"For example, when an AI technology receives solely a prompt[27] from a human and produces complex written, visual, or musical works in response, the “traditional elements of authorship” are determined and executed by the technology—not the human user."

Only the human-generated elements of modifying/arranging AI output are copyrightable

"a human may select or arrange AI-generated material in a sufficiently creative way that “the resulting work as a whole constitutes an original work of authorship.”[33] Or an artist may modify material originally generated by AI technology to such a degree that the modifications meet the standard for copyright protection.[34] In these cases, copyright will only protect the human-authored aspects of the work, which are “independent of” and do “not affect” the copyright status of the AI-generated material itself.[35]"

- GrahamxReed (talk) 23:00, 14 May 2023 (EDT)
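
For the dependency check listed under "Potential Freedom issues", a first pass could read the license metadata that installed Python packages declare about themselves. This is only a minimal sketch: the function names are made up for illustration, it assumes the workflow's dependencies are Python packages installed in the current environment, and self-declared metadata can be missing or wrong, so each result would still need manual review.

```python
from importlib import metadata


def declared_licenses(classifiers, declared):
    """Collect license strings from trove classifiers and a License field.

    `classifiers` is a list of classifier strings; `declared` is the raw
    License (or License-Expression) value, or None if the field is absent.
    """
    found = [c.split(" :: ")[-1] for c in classifiers if c.startswith("License ::")]
    declared = (declared or "").strip()
    if declared:
        # Some packages paste the full license text; keep only the first line.
        found.append(declared.splitlines()[0])
    return found


def report_installed_licenses():
    """Map each installed package name to the licenses it declares (may be empty)."""
    report = {}
    for dist in metadata.distributions():
        meta = dist.metadata
        name = meta.get("Name")
        if not name:
            continue
        report[name] = declared_licenses(
            meta.get_all("Classifier") or [],
            meta.get("License-Expression") or meta.get("License"),
        )
    return report


if __name__ == "__main__":
    for name, licenses in sorted(report_installed_licenses().items()):
        print(f"{name}: {', '.join(licenses) or 'no license declared, check manually'}")
```

An empty list does not mean a package is non-free, only that it has to be checked by hand.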

Testing model viability

Tools are needed to assess the pros/cons of each model.

HuggingFace's Open LLM leaderboard

Ordinal value scales could exist for:

Source of model training data

  • amount of data
  • date range (e.g. distinguishing old science from new science for smaller scale models)
  • level of censorship (important to make personal+research use distinct from business use)

Problem solving

  • math
  • creative problem solving (there exists methodology for testing this in humans)
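
One way the ordinal scales above could be recorded is sketched below. Everything in it is an assumption for illustration: the three-step scale, the field names, and the two placeholder models with made-up ratings. Nothing here is an agreed rubric or a real measurement.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Ordinal scale: only the ordering is meaningful, not the distances."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class ModelAssessment:
    name: str
    data_amount: Level               # amount of training data
    data_recency: Level              # how recent the date range is
    censorship: Level                # level of censorship
    math: Level                      # math problem solving
    creative_problem_solving: Level  # creative problem solving


def at_least(assessments, criterion, minimum):
    """Return the names of models rated `minimum` or higher on one criterion."""
    return [a.name for a in assessments if getattr(a, criterion) >= minimum]


# Placeholder entries with invented ratings, for demonstration only.
EXAMPLES = [
    ModelAssessment("model-a", Level.HIGH, Level.LOW, Level.MEDIUM, Level.LOW, Level.HIGH),
    ModelAssessment("model-b", Level.MEDIUM, Level.HIGH, Level.LOW, Level.HIGH, Level.MEDIUM),
]

if __name__ == "__main__":
    print(at_least(EXAMPLES, "math", Level.MEDIUM))
```

Because the scales are ordinal, comparisons such as "at least MEDIUM" are safe, while averaging the numbers across criteria would not be.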

General trends

  • Larger models are more prone to human superstition[1], but also generate more human-like readability.
  • Quantization (à la GPTQ) allows consumer hardware to run large models.
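
The arithmetic behind the quantization point can be sketched as follows. This is a weights-only estimate, not an implementation of GPTQ: it assumes memory is simply parameter count times bits per weight, and it ignores activations, caches, and the scale/zero-point metadata that real quantized formats store. The function name is made up for illustration.

```python
def weight_memory_gb(n_params, bits_per_weight):
    """Approximate memory for the weights alone, in gigabytes (10**9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    for bits in (16, 8, 4):
        print(f"7B parameters at {bits}-bit: {weight_memory_gb(7e9, bits):.1f} GB")
```

Under these assumptions, going from 16-bit to 4-bit weights cuts the weight footprint to a quarter, which is what brings larger models within reach of consumer GPUs.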

Extra information

Stable Diffusion

Stable Diffusion model files (.ckpt) are released under a non-free license.

Starting points for Stable Diffusion: the model card at https://huggingface.co/CompVis/stable-diffusion-v1-4 and the license at https://huggingface.co/spaces/CompVis/stable-diffusion-license

stable-diffusion-webui

Large Language Models

LLaMa

  • The LLaMa code is released under the GNU General Public License v3.0: https://github.com/facebookresearch/llama/blob/main/LICENSE (the model weights are distributed under separate terms).
  • LLaMa is comparable to GPT-3 and has been fully released as a torrent and on Hugging Face.
  • The 7B-parameter model requires 10 GB of VRAM. The 13B model requires 20 GB, 30B requires 40 GB, and 65B requires 80 GB.
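
The VRAM figures listed above can be turned into a quick lookup for deciding which LLaMa size a given GPU could hold. This is a rough sketch: the table only copies the numbers from this page, which are not independently verified here, and the function name is made up for illustration.

```python
# VRAM requirements in GB, as listed on this page.
VRAM_REQUIREMENT_GB = {"7B": 10, "13B": 20, "30B": 40, "65B": 80}


def largest_model_that_fits(available_vram_gb):
    """Return the largest listed model whose requirement fits, or None."""
    fitting = [
        (need, name)
        for name, need in VRAM_REQUIREMENT_GB.items()
        if need <= available_vram_gb
    ]
    # Tuples sort by requirement first, so max() picks the biggest model.
    return max(fitting)[1] if fitting else None


if __name__ == "__main__":
    for vram in (8, 12, 24, 48, 80):
        print(f"{vram} GB VRAM -> {largest_model_that_fits(vram)}")
```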

Unknown license but still noteworthy

External links



Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the page “GNU Free Documentation License”.

The copyright and license notices on this page only apply to the text on this page. Any software or copyright-licenses or other similar notices described in this text has its own copyright notice and license, which can usually be found in the distribution or license text itself.