Recent comments in /f/MachineLearning

camp4climber t1_jcbomzj wrote

Generally it would be unfair to claim that you beat benchmark results if you train for 8x more epochs than other methods. Benchmarks exist to ensure that methods are on a somewhat level playing field. There's certainly some wiggle room depending on the task, but in this case I don't believe that a lower learning rate and more epochs is novel or interesting enough to warrant a full paper.

That's not to say the work isn't worth anything, though! There may be a paper in there somewhere if you can further explore a theoretical narrative specific to why that would be the case. Perhaps you can make comparisons to large models where the total number of FLOPs is fixed. In that case, a claim that a smaller model trained for more epochs is more efficient than a larger model trained for fewer epochs would be interesting.
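To make the fixed-FLOPs idea concrete, here's a back-of-the-envelope sketch using the common FLOPs ≈ 6 · params · tokens approximation from the scaling-law literature (all numbers are purely illustrative, not from any real experiment):

```python
# Rough training-cost model: FLOPs ~= 6 * N_params * N_tokens_seen.
# This is the standard back-of-the-envelope approximation, good enough
# for comparing compute budgets across model sizes.
def train_flops(n_params: float, tokens_per_epoch: float, epochs: int) -> float:
    return 6.0 * n_params * tokens_per_epoch * epochs

# Hypothetical budget: a 1B-param model trained for 2 epochs costs the
# same compute as a 4x smaller (250M) model trained for 4x the epochs.
big = train_flops(1e9, 1e9, epochs=2)
small = train_flops(250e6, 1e9, epochs=8)
assert big == small  # identical compute budget
```

Under a comparison like this, "more epochs" is no longer free — it's traded against model size at equal compute, which is the kind of claim a benchmark table can support.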

For what it's worth, the optimizer settings in the VGAE paper do not appear to be tuned. I imagine you could improve on their results in far fewer than 1500 epochs by implementing some simple tricks like learning rate decay.
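Frameworks ship schedulers for this (e.g. PyTorch's `torch.optim.lr_scheduler.CosineAnnealingLR`); as a dependency-free sketch of the idea, with illustrative hyperparameters:

```python
import math

def cosine_lr(step: int, total_steps: int, base_lr: float, min_lr: float = 0.0) -> float:
    """Cosine learning-rate decay: starts at base_lr, ends at min_lr."""
    progress = step / max(1, total_steps)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

# Early steps keep the LR near base_lr; by the final step it has
# decayed smoothly toward min_lr, letting the model settle into a minimum.
lr_start = cosine_lr(0, 1000, base_lr=0.01)     # ~0.01
lr_end = cosine_lr(1000, 1000, base_lr=0.01)    # ~0.0
```

You'd call this once per training step and feed the result to the optimizer; the effect is usually faster convergence than a fixed learning rate held for thousands of epochs.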

10

Nhabls t1_jcbnr3g wrote

That's Alpaca, a fine-tune of LLaMA, and you're just pointing to another of OpenAI's shameless behaviors. Alpaca couldn't be used commercially because OpenAI thinks it can forbid using the outputs of its models to train competing models. Meanwhile, they also argue that they can take any and all copyrighted data from the internet with no permission or compensation needed.

They think they can have it both ways. At this point I'm 100% rooting for them to get screwed as hard as possible in court over that contradiction.

19

Nhabls t1_jcbmn7g wrote

> If we want everything to be open sourced then chatgpt as it is now probably wouldn't be possible at all

All of the technical concepts behind ChatGPT are openly accessible and have been for the past decade, as was the work before it. A lot of that work came from big tech companies that operate for profit, so the profit motive is no excuse; the only explanation is unprecedented greed in the space.

Though it comes as no surprise from the company that thinks it can take any copyrighted data from the internet without permission while at the same time forbidding others from training models on data they get from the company's products. It's just sleaziness at every level.

>But anyway I think basic theoretical breakthroughs like a new architecture for AI will still be shared among academia since those aren't directly related to money

This is exactly what hasn't happened: they refused outright to share any architectural details, and no one was even expecting the weights or code. This is what people are upset about, and rightly so.

5

bartturner t1_jcbmg57 wrote

> That's not how capitalism works.

I totally get that it makes no business sense for Google to give away so much stuff. Look at Android: they let Amazon use it for all their products.

But I still love it. I wish more companies rolled like Google. They feel like lifting all boats also lifts theirs.

Google, having been the AI leader for over a decade, has set a way of doing things.

OpenAI is not doing the same, and that really sucks. I hope the others won't follow OpenAI's approach and instead continue to roll like they have.

−1

Nhabls t1_jcbm504 wrote

> Google Deepmind did go that route of secrecy with AlphaGo

AlphaGo had a proper paper released, what are you talking about?

OpenAI's outright refusal to share their procedure for training GPT-4 very much breaks precedent and is horrible for the field as a whole. It shouldn't be glossed over.

14

-Rizhiy- t1_jcblvqs wrote

This is a moot point. Most companies use AI research without contributing back; that's generally what being a business means. Nothing new here.

They just need to admit that they are a business now and want to earn money for their own benefit, rather than "benefit all of humanity". Changing the name would be a good idea too)

11

I_will_delete_myself t1_jcblldw wrote

People can't do deep learning or AI without the tools to make it happen. Imagine how complicated collecting and cleaning data is at terabyte scale. People also need to annotate data for it to work, which requires software to get it done in a cost-effective manner.

9

RareMajority t1_jcbku9g wrote

Is it a good thing, though, for companies to be open-sourcing things like their weights? If there's enough knowledge in the open to build powerful unaligned AIs, that seems rather dangerous to me. I definitely don't want just anyone to be able to build their own AGI to use for their own reasons.

−10

Eaklony t1_jcbkqgk wrote

That's not how capitalism works. To produce ChatGPT they needed a lot of money for a huge GPU farm, which had to be invested by people who expect a profit from it. If we wanted everything to be open-sourced, then ChatGPT as it is now probably wouldn't be possible at all. But I think basic theoretical breakthroughs, like new AI architectures, will still be shared among academia, since those aren't directly tied to money. Hopefully it will just be the detailed implementations of actual products that aren't open-sourced.

−3

MrTacobeans t1_jcbj1kw wrote

This is coming from a total layman's point of view, someone who follows AI Schmutz pretty closely, but anyway...

Wouldn't running a lower learning rate for more epochs reduce many of the benefits of a NN outside of a synthetic benchmark?

From what I know, a NN can be loosely trained, helpfully "hallucinate" over the gaps it doesn't know, and still be useful. When the network is constricted, it might be extremely accurate and smaller than the loose model, but the intrinsically useful/good hallucinations will be lost, and things outside the benchmark will hallucinate worse than in the loose model.

I give props to AI engineers; this all seems like an incredibly delicate balance, and it's probably why massive amounts of data are needed to avoid either extreme.

I feel like there's no need to enforce an epoch count or learning-rate schedule in benchmarks, because models usually converge to their best versions at different points regardless of the data used. And if they're writing a paper, they likely tweaked something worth training and writing about beyond just beating a benchmark.

−7

ScientiaEtVeritas t1_jcbiupk wrote

It's not only about the model releases but also the research details. With them, others can replicate the results and improve on them, which might also lead to more commercial products and open-sourced models with less restrictive licenses. In general, AI progress is fastest when everyone shares their findings. Keeping findings secret and patenting them, on the other hand, actively hinders progress.

9

bartturner t1_jcbh70h wrote

But it has been up to this point. ChatGPT is based on a technology breakthrough by Google.

There should be strong pushback on OpenAI's behavior. Otherwise we might end up with Google and others no longer sharing their incredible breakthroughs.

11