Recent comments in /f/MachineLearning
VelveteenAmbush t1_jcd6opg wrote
Reply to comment by twilight-actual in [D] What do people think about OpenAI not releasing its research but benefiting from others’ research? Should google meta enforce its patents against them? by [deleted]
You could patent your algorithm and offer some sort of GPL-like patent license, but no one respects software patents anyway (for good reason IMO) and you'd be viewed as a patent troll if you tried to sue to enforce it.
GPL itself is a copyright license and does you no good if OpenAI is using your ideas but not your code. (Plus you'd actually want AGPL to force code release for an API-gated service, but that's a separate issue.)
VelveteenAmbush t1_jcd6bkq wrote
Reply to comment by sobe86 in [D] What do people think about OpenAI not releasing its research but benefiting from others’ research? Should google meta enforce its patents against them? by [deleted]
I think Hassabis' goal is to build a synthetic god and reshape the cosmos, and open research isn't necessarily conducive to that except as needed to keep researchers motivated and engaged.
bert0ld0 t1_jcd5w2f wrote
Reply to comment by ScientiaEtVeritas in [D] What do people think about OpenAI not releasing its research but benefiting from others’ research? Should google meta enforce its patents against them? by [deleted]
From all of this I wonder where Apple is. Did they completely miss the boat?
noiseinvacuum t1_jcd41kc wrote
Reply to comment by RareMajority in [D] What do people think about OpenAI not releasing its research but benefiting from others’ research? Should google meta enforce its patents against them? by [deleted]
From a research perspective, imo, it's 1000x better to follow the Meta model of releasing code + architecture openly and sharing weights with researchers than to be completely closed and call yourself Open. I understand that there are genuine risks with weights being available to adversaries, but I think it's still better for the progress of the very young field of AI.
suflaj t1_jcd4131 wrote
Reply to comment by E_Snap in [Discussion] What happened to r/NaturalLanguageProcessing ? by MadNietzsche
> I’ve been using ChatGPT to write all of my sales emails for difficult clients lately, and it has been fantastic. It took what should have been another staff member at my company and made it into a proofreading duty I can handle while working on other things.
I fail to see the point you're making.
> Also… hate to say it, but the fact that you’re using the words “humiliated” and “jailbroken” in this context doesn’t exactly cast a very good light on your understanding of the situation.
I also fail to see what you're saying. How else would you describe events in which you show how stupid ChatGPT actually is, and those where you trick it into bypassing all of its safety filters?
Alimbiquated t1_jcd2z4g wrote
Reply to comment by harharveryfunny in Modern language models refute Chomsky’s approach to language [R] by No_Draft4778
I agree that comparing these learning processes to brains is bogus.
There is a general tendency to assume that if something seems intelligent, it must be like a human brain. It's like assuming that because it's fast, a car must have legs like a horse and eat oats.
noiseinvacuum t1_jcd2xzt wrote
Reply to comment by ScientiaEtVeritas in [D] What do people think about OpenAI not releasing its research but benefiting from others’ research? Should google meta enforce its patents against them? by [deleted]
Completely agree with you on this. This will get much worse IMO, especially with the big investment from Microsoft in OpenAI and the fact that MS is now openly and directly challenging Google. This whole aggressive AI alpha posturing from Satya Nadella has put Google in a difficult spot; I can't see how Google will continue to justify sharing its research openly to its investors.
LegacyAngel t1_jcd1idr wrote
Reply to comment by Purplekeyboard in [D] What do people think about OpenAI not releasing its research but benefiting from others’ research? Should google meta enforce its patents against them? by [deleted]
>But without OpenAI, who would have spent the billions of dollars they have burned through creating and then actually giving people access to models like GPT-3 and now GPT-4?
Other companies are providing access. OpenAI is just being reckless.
usual disclaimer here
E_Snap t1_jcd16ks wrote
Reply to comment by suflaj in [Discussion] What happened to r/NaturalLanguageProcessing ? by MadNietzsche
I’ve been using ChatGPT to write all of my sales emails for difficult clients lately, and it has been fantastic. It took what should have been another staff member at my company and made it into a proofreading duty I can handle while working on other things.
Also… hate to say it, but the fact that you’re using the words “humiliated” and “jailbroken” in this context doesn’t exactly cast a very good light on your understanding of the situation.
twilight-actual t1_jcd0wcs wrote
Reply to comment by bartturner in [D] What do people think about OpenAI not releasing its research but benefiting from others’ research? Should google meta enforce its patents against them? by [deleted]
What exactly would that pushback be? Boycott? Post mean things?
About the only thing that could potentially prevent this is if the algorithms that we put into the public domain are protected by a license like the GPL, or something similar.
I haven't been following code releases, so I don't know if that's being done. And to be honest, I doubt most of the information flow is going by code. Rather, it's in the papers.
Is there a way to protect papers by a "GPL"? I honestly doubt it, because at that level we're dealing strictly with ideas, and the only way to protect an idea is to patent it.
Perhaps the community, as a whole, should start patenting all their ideas, and then assigning the patents to a public trust that ensures that any derivative technology is published freely, too, under the same patent type.
HyperModerate t1_jcd0lnn wrote
Reply to comment by Nhabls in [D] What do people think about OpenAI not releasing its research but benefiting from others’ research? Should google meta enforce its patents against them? by [deleted]
The way AI is used to launder copyright and licensing is concerning. Copyrighted data is used to train a model. The model’s output, now also licensed, is used to finetune a second model, also separately licensed. Finally, this highly licensed model is considered for public release.
The attitude is basically the same as pirating, but there is no similar legal precedent.
To be clear, I think AI research should be open.
sam__izdat t1_jccyxl4 wrote
Reply to [D] What do people think about OpenAI not releasing its research but benefiting from others’ research? Should google meta enforce its patents against them? by [deleted]
As a spectator, it's the standard story that's played out a million times now. I see ML as pre-scientific. If capital is allowed to take full control and call all the shots, it's not moving past that "pre" any time soon. It'll be a digital Taylorist discipline for PR industry surveillance and optimizing Amazon packers' pee breaks, and the brief flurry of actually useful progress is probably done.
[deleted] OP t1_jccyrl6 wrote
Competitive-Rub-1958 t1_jccyreq wrote
Reply to [N] PyTorch 2.0: Our next generation release that is faster, more Pythonic and Dynamic as ever by [deleted]
I think I may be reading things wrong here, but is FlashAttention only used for calculating basic scaled QKV attention, rather than being embedded inside their MHA module?
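For reference, this is roughly the call I mean: a minimal sketch assuming PyTorch 2.0's `torch.nn.functional.scaled_dot_product_attention`, which is the entry point that can dispatch to the FlashAttention kernel on supported hardware (the shapes and values here are just toy examples).

```python
import torch
import torch.nn.functional as F

# Toy Q/K/V tensors shaped (batch, heads, seq_len, head_dim).
q = torch.randn(2, 8, 128, 64)
k = torch.randn(2, 8, 128, 64)
v = torch.randn(2, 8, 128, 64)

# The bare scaled QKV attention call; the backend (FlashAttention,
# memory-efficient, or plain math) is picked internally, and the flash
# kernel only kicks in on CUDA with fp16/bf16 inputs.
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
print(out.shape)  # torch.Size([2, 8, 128, 64])
```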
camp4climber t1_jccy0qa wrote
Reply to comment by farmingvillein in [D] Is there an expectation that epochs/learning rates should be kept the same between benchmark experiments? by TheWittyScreenName
Yea that's a fair point. These kinds of examples certainly exist and often come from large research labs at the very edge of state of the art, where the interesting narrative point is scale. The context of specific benchmarks or applications certainly matters.
I still think my point stands in the general case, at least for most of us independent researchers. Ultimately research is about revealing novel insights. "Train for longer" is not that interesting. But an LLM that fits onto a single GPU, contains 13B parameters, and is capable of outperforming a 175B parameter model is certainly interesting.
haljm t1_jccw6ef wrote
Reply to comment by nopainnogain5 in [D] To those of you who quit machine learning, what do you do now? by nopainnogain5
I'm probably not the most qualified since I'm a PhD student, but in my experience it's generally either an ML position whose description requires some domain knowledge, or a position titled in the application domain that specifies you're doing ML.
nopainnogain5 OP t1_jccvvcj wrote
Reply to comment by darkshenron in [D] To those of you who quit machine learning, what do you do now? by nopainnogain5
What would you say are the key skills I should sharpen to work as a data engineer? (ofc I can google it, but I'm curious what it looks like from your experience)
[deleted] t1_jccvpth wrote
Reply to comment by darkshenron in [D] To those of you who quit machine learning, what do you do now? by nopainnogain5
[deleted]
Daos-Lies t1_jcctn72 wrote
Reply to comment by camp4climber in [D] Is there an expectation that epochs/learning rates should be kept the same between benchmark experiments? by TheWittyScreenName
Could I pick you up on your point about it not being interesting enough for a paper?
​
A comprehensive and properly conducted hyperparameter sweep of a selection of state-of-the-art models would provide useful information to the community at large. It would be useful to know what settings are ideal for training any particular model architecture (or checkpoint of that architecture) on any particular type of dataset.
​
The exact hyperparameters that are best for the particular dataset of cat pictures the paper used would differ somewhat from those for your own dataset of cat pictures, but the best hyperparameters for any set of cat pictures, on that particular model, are probably going to be quite similar.
And so it is useful to have that knowledge, presented in this hypothetical paper, to refer to when you start training a model on cat pictures.
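(To make concrete what I mean by a sweep, it's nothing fancier than the sketch below; `train_and_evaluate` is a hypothetical stand-in for whatever training loop the hypothetical paper would actually run, and the candidate values are made up.)

```python
import itertools
import random

# Hypothetical stand-in: train the model with these settings on a fixed
# dataset and return a validation score. Replace with a real training run.
def train_and_evaluate(learning_rate: float, epochs: int) -> float:
    return random.random()

learning_rates = [3e-4, 1e-4, 3e-5]
epoch_counts = [10, 30, 100]

# Exhaustive grid over the two hyperparameters being discussed.
results = {
    (lr, n): train_and_evaluate(lr, n)
    for lr, n in itertools.product(learning_rates, epoch_counts)
}

best = max(results, key=results.get)
print(f"best (lr, epochs) = {best}, score = {results[best]:.3f}")
```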
​
---
​
I have a tendency to treat epochs and learning rate like an accelerator on a car, pumping them up when you want to go faster and bringing them down when you want more control and the ability to check where you're going so you don't miss your exit.
​
Whereas with a hyperparameter like C for an SVM, I'm much more likely to actually bother formally looping through and finding the 'right' C than just trying some values and going for it.
​
And the key point there is that SVMs tend to train much much faster than NNs, so I'm not bothering to take the massive extra time it would take to find the 'right' epoch and learning rate. (also epoch and LR are quite intuitive in what they actually mean, which does make them a bit easier to guess at)
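(For the SVM case, that loop is just something like the sketch below, assuming scikit-learn's GridSearchCV on a toy dataset; the candidate C values are arbitrary.)

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Toy dataset standing in for whatever you are actually classifying.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Formally loop over candidate C values with cross-validation,
# rather than just trying a couple of values and going with it.
search = GridSearchCV(SVC(kernel="rbf"), {"C": [0.01, 0.1, 1, 10, 100]}, cv=5)
search.fit(X, y)

print("best C:", search.best_params_["C"], "cv accuracy:", round(search.best_score_, 3))
```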
​
But if someone had already put the effort in to find the 'right' epoch and LR, even if I was aware that they'd only be approximately 'right' for my particular problem, I'd definitely use that as my starting point.
​
---
​
Ok, and I've written quite a lot here already, but I'm going to end by mentioning that in the paper that accompanied the GPT-4 release, they had a whole section on predicting the loss that would be achieved at a certain point in GPT-4's training procedure. When you get to training at that scale, it's pretty costly to guess at your training procedure, so any metrics you have at all on how to get it right the first time are valuable just in terms of the cost of compute time.
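(Roughly, the trick is to fit a scaling curve on cheap small runs and extrapolate. A toy sketch with made-up numbers follows, fitting a simple power law in log-log space; the actual functional form and data in the GPT-4 report are more involved.)

```python
import numpy as np

# Made-up measurements from small runs: compute budget vs. final loss.
compute = np.array([1e3, 3e3, 1e4, 3e4, 1e5])
loss = np.array([4.2, 3.7, 3.3, 3.0, 2.8])

# Fit a simple power law, loss ~ a * compute^(-b), by linear regression
# in log-log space.
slope, intercept = np.polyfit(np.log(compute), np.log(loss), 1)
a, b = np.exp(intercept), -slope

# Extrapolate to a much larger budget: predict before you spend the compute.
target = 1e8
print(f"predicted loss at {target:.0e} compute: {a * target ** (-b):.2f}")
```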
​
So yes u/TheWittyScreenName, it is worth a paper, and my recommendation would be to focus it on conducting and presenting a solid, systematic analysis.
​
Edit: Well gosh, I've just reread your comment u/camp4climber and you are basically saying the same thing. But maybe my fleshing it out is useful for OP.
darkshenron t1_jcctjv4 wrote
A lot of NLP specialists are having an existential crisis since ChatGPT and now GPT-4 have been shown to solve pretty much any problem you throw at them. Some of us have used our knowledge of productising large language models to move into MLOps and data engineering, where the outcome is more deterministic.
neato5000 t1_jccszzz wrote
I've had jobs that were similar to what you describe. My current job involves fewer tiny tweaks to massive DL models and more feature engineering and engineering in general, which suits me better.
My slightly warm take is that DL at the coal face in industry feels very random, very time consuming, and as a result a bit demoralising. More power to you if you have the knack for it, and enjoy it, it's just not super my bag.
suflaj t1_jccsipl wrote
Reply to comment by E_Snap in [Discussion] What happened to r/NaturalLanguageProcessing ? by MadNietzsche
You mean the same type of foresight with GPT-3, when people (or rather "people", given that it was mostly journalists) got baited into spreading hysteria over the authors' claims that the technology is world-ending? Or ChatGPT, which was humiliated and jailbroken within 36 hours of its public release?
It has been a day now, and I've heard the same concerns that it's ultimately biased. Definitely not career-ending.
E_Snap t1_jccs0vy wrote
Reply to comment by suflaj in [Discussion] What happened to r/NaturalLanguageProcessing ? by MadNietzsche
I thought Reddit’s patented lack of foresight regarding technology was mostly located in /r/technology, and yet…
The way I see it, with the pace at which this field moves, those sorts of objections aren’t worth the energy required to type them. They’ll be obsolete and irrelevant by the time you finish writing them.
farmingvillein t1_jccqy2i wrote
Reply to comment by camp4climber in [D] Is there an expectation that epochs/learning rates should be kept the same between benchmark experiments? by TheWittyScreenName
> Generally it would be unfair to claim that you beat benchmark results if you train for 8x more epochs than other methods. Benchmarks exist to ensure that methods are on a somewhat level playing field. There's certainly some wiggle room depending on the task, but in this case I don't believe that a lower learning rate and more epochs is novel or interesting enough to warrant a full paper.
Although the Llama paper is a bit of a rejoinder here, since training longer is (arguably) their core contribution.
VelveteenAmbush t1_jcd760v wrote
Reply to comment by professorlust in [D] What do people think about OpenAI not releasing its research but benefiting from others’ research? Should google meta enforce its patents against them? by [deleted]
They're purposefully withholding the information you'd need to use their results in research. This proposed research boycott is sort of a "you can't fire me, I quit" response.