Recent comments in /f/MachineLearning

VelveteenAmbush t1_jcbw6mx wrote

DeepMind's leaders would love to hoard their secrets. The reason they don't is that it would make them a dead end for the careers of their research scientists -- because aside from the occasional public spectacle (AlphaGo vs. Lee Sedol) nothing would ever see the light of day. If they stopped publishing, they'd hemorrhage talent and die.

OpenAI doesn't have this dilemma because it actually commercializes its cutting-edge research. Commercialization makes the capabilities apparent to everyone, and being involved in creating them advances your career even without a paper on arXiv.

61

omniron t1_jcbvup9 wrote

All research gets used for productive entrepreneurial purposes. What's sad about OpenAI is that they started with a mission of openness, literally in their name, and are now going in the opposite direction.

Google will eat their lunch, though. Google has the world's largest collection of video, and video is the final frontier for large transformer AI.

10

crt09 t1_jcbv608 wrote

> Alpaca couldn't be commercial because openai thinks it can forbid usage of outputs from their model to train competing models.

I don't think they claimed this anywhere? It seems the only reason the Alpaca weights haven't been released is Meta's policy on releasing LLaMA weights.

https://crfm.stanford.edu/2023/03/13/alpaca.html

> We have reached out to Meta to obtain guidance on releasing the Alpaca model weights, both for the 7B Alpaca and for fine-tuned versions of the larger LLaMA models.

Plus, they already released the data they got from the GPT API, so anyone with LLaMA 7B, the ability to implement Alpaca's fine-tuning code, and 100 bucks can replicate it.

(EDIT: they released the code. Now all you need is a willingness to torrent LLaMA 7B, plus 100 bucks.)
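For a sense of what that replication looks like, here's a minimal sketch of the fine-tuning step using the HuggingFace stack. This is not the actual Stanford code; the file paths, prompt template, and hyperparameters are illustrative assumptions.

```python
# Sketch of the Alpaca recipe: supervised fine-tuning of LLaMA 7B on the
# released 52K instruction-following examples. Illustrative only.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

model_path = "path/to/llama-7b"  # weights must be obtained separately
tokenizer = AutoTokenizer.from_pretrained(model_path)
tokenizer.pad_token = tokenizer.eos_token  # LLaMA ships without a pad token
model = AutoModelForCausalLM.from_pretrained(model_path)

# The released data is a JSON list of {instruction, input, output} records.
train = load_dataset("json", data_files="alpaca_data.json")["train"]

def tokenize(example):
    # Simplified version of the Alpaca prompt template.
    prompt = (f"### Instruction:\n{example['instruction']}\n\n"
              f"### Input:\n{example['input']}\n\n"
              f"### Response:\n{example['output']}")
    return tokenizer(prompt, truncation=True, max_length=512)

train = train.map(tokenize, remove_columns=train.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="alpaca-repro", num_train_epochs=3,
                           per_device_train_batch_size=4, learning_rate=2e-5),
    train_dataset=train,
    # mlm=False pads batches and derives causal-LM labels from the inputs
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```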

2

VelveteenAmbush t1_jcbv0rs wrote

GPT-4 is an actual commercial product though. AlphaGo was just a research project. No sane company is going to treat the proprietary technological innovations at the core of their commercial strategy as an intellectual commons. It's like asking them to give away the keys to the kingdom.

−2

VelveteenAmbush t1_jcbu8nr wrote

> While they also potentially don't release every model (see Google's PaLM, LaMDA) or only with non-commercial licenses after request (see Meta's OPT, LLaMA), they are at least very transparent when it comes to ideas, architectures, trainings, and so on.

They do this because they don't ship. If you're a research scientist or ML research engineer, publication is the only way to advance your career at a company like that; nothing else would ever see the light of day. It's basically a better-funded version of academia, because it doesn't seem to be set up to actually create and ship products.

Whereas if you can say "worked at OpenAI from 2018-2023, team of 5 researchers that built GPT-4 architecture" or whatever, that speaks for itself. The products you release and the role you had on the teams that built them are enough to build a resume -- and probably a more valuable resume at that.

14

Chuyito t1_jcbu40y wrote

1. We are about to see a new push for a "robots.txt" equivalent for training data, e.g., Yelp serving a "datarules.txt" file forbidding training on its comments for proprietary use. The idea is that you could specify a license that allows training on your data for open source, but not for profit (a hypothetical sketch follows this list). The benefit for Yelp is similar to the original Netflix Prize dataset we all used at some point.

2. It's going to create a massive push for open frameworks. I can see Nvidia going down the path of "appliances," similar to what IBM and many other tech companies did for servers with pre-installed software: much of it open source, configured and ready to use and tune for your app. If you want to adjust the weights on certain bias filters but not write the model from scratch, having an in-house instance of your "assistant" will be attractive to many. (E.g., if you are doing research on biofuels, ChatGPT will censor way too much in trying to push "green," and lose track of the research in favor of policy.)
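To make (1) concrete, here's what such a file might look like. No such standard exists today, so every directive below is invented by analogy with robots.txt:

```
# Hypothetical datarules.txt, modeled on robots.txt. Invented syntax,
# for illustration only -- there is no such standard yet.
User-agent: *
Allow-training: /                  # anyone may train on this site's data...
Training-license: open-source      # ...but only for open-source models

User-agent: ForProfitCrawler
Disallow-training: /reviews/       # no commercial training on review text
```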

27

Alimbiquated t1_jcbspbs wrote

This kind of model needs vastly more input data to learn than the human brain does. It doesn't make sense to compare the two.

For example, ChatGPT was trained on 570 GB of data comprising 300 billion words.

https://analyticsindiamag.com/behind-chatgpts-wisdom-300-bn-words-570-gb-data/

If a baby heard one word a second, it would take nearly 10,000 years to learn the way ChatGPT did. But babies only need a few years, and they hear words at a much lower average rate.
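Quick sanity check on that number:

```python
# 300 billion words at one word per second, nonstop.
words = 300_000_000_000
seconds_per_year = 60 * 60 * 24 * 365   # 31,536,000
print(words / seconds_per_year)          # ~9,513 years
```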

So these models don't undermine the claim of innateness at all.

7

RareMajority t1_jcbqi56 wrote

I'm not referring to OpenAI here. Meta released the LLaMA weights, and now anyone can build an AI based on that model, for any purpose, without any attempt at alignment. Maybe there's a middle ground between the two approaches.

4

satireplusplus t1_jcbq2ik wrote

> most notably dropout.

Probably unenforceable, and math shouldn't be patentable anyway. Might as well try to patent matrix multiplication (I'm sure someone has tried). Besides, dropout isn't even complex math: it's an elementwise multiplication with randomized 1s and 0s, that's all it is.
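A minimal NumPy sketch makes the point. This is the standard "inverted" variant, which also rescales the surviving activations (one assumption beyond the bare mask-multiply above):

```python
import numpy as np

def dropout(x, p=0.5, training=True):
    """Elementwise multiply by a random 0/1 mask; scale survivors by
    1/(1-p) so the expected activation is unchanged ("inverted dropout")."""
    if not training:
        return x
    mask = (np.random.rand(*x.shape) > p).astype(x.dtype)
    return x * mask / (1.0 - p)

x = np.ones((2, 4))
print(dropout(x))  # roughly half the entries zeroed, survivors scaled to 2.0
```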

17

amhotw t1_jcbpd9z wrote

I understand that. I am pointing out that they started on different paths. One of them actually matched its name with what it was doing; the other was a contradiction from the beginning.

Edit: Wow, people either can't read or don't read enough history.

−4