Recent comments in /f/MachineLearning
SuperTankMan8964 t1_jcaaoy4 wrote
Reply to comment by trajo123 in [N] Baidu to Unveil Conversational AI ERNIE Bot on March 16 (Live) by kizumada
Pretty sure you need to verify your account with personal info (e.g. national ID card and an ID-bound cellphone number) to use their services, which means they can track you down if you dare to speak against the supreme leader.
[deleted] OP t1_jca9un1 wrote
respeckKnuckles t1_jca9td9 wrote
Reply to comment by YouAgainShmidhoobuh in [D] On research directions being "out of date" by redlow0992
Have you never been peer reviewed before?
NoScallion2450 t1_jca9r94 wrote
Reply to comment by MysteryInc152 in [D] What do people think about OpenAI not releasing its research but benefiting from others’ research? Should google meta enforce its patents against them? by [deleted]
Maybe, but what about this? https://www.cnbc.com/2019/04/18/apple-paid-5-billion-to-6-billion-to-settle-with-qualcomm-ubs.html
xEdwin23x t1_jca9kr0 wrote
Reply to comment by NoScallion2450 in [D] What do people think about OpenAI not releasing its research but benefiting from others’ research? Should google meta enforce its patents against them? by [deleted]
OpenAI is not a small company either. It may be a "startup", but it's clearly backed by Microsoft, and between the two of them there are probably quite a few patents that Google has used in the past too.
MysteryInc152 t1_jca93qy wrote
Reply to [D] What do people think about OpenAI not releasing its research but benefiting from others’ research? Should google meta enforce its patents against them? by [deleted]
I don't think patent battles will go anywhere. DeepMind could simply stop releasing papers (or curtail it significantly) like they've already hinted they might do.
NoScallion2450 t1_jca8qxo wrote
Reply to comment by xEdwin23x in [D] What do people think about OpenAI not releasing its research but benefiting from others’ research? Should google meta enforce its patents against them? by [deleted]
Why do you say so? For Google it's probably peanuts in terms of cost. And there is a clear case for them to make that transformers originated with them.
xEdwin23x t1_jca8d58 wrote
Reply to comment by NoScallion2450 in [D] What do people think about OpenAI not releasing its research but benefiting from others’ research? Should google meta enforce its patents against them? by [deleted]
It's probably not in their interest, as they know they'll both end up worse off if they go down that path.
knobbyknee t1_jca8489 wrote
I found this article to be extremely insightful. The best I have seen at explaining in layman's terms why large language models work and what their limitations are.
ggf31416 t1_jca7zwz wrote
Reply to [D] Choosing Cloud vs local hardware for training LLMs. What's best for a small research group? by PK_thundr
https://fullstackdeeplearning.com/cloud-gpus/
Your best bet to reach 256 GB of VRAM in the cloud would be Azure with 4x 80 GB A100 instances; however, your 40k budget will only buy you about 3000 hours of compute at best on demand, with spot instances stretching that a bit further.
If that's not enough, you will have to figure out how to build a server with RTX 6000 Ada cards (48 GB each). RTX 4090s would be cheaper, but there may be legal issues due to the gaming driver license, you would need multiple servers (or an aggressive power cap) because of power draw, and Nvidia dropped P2P support, which may or may not matter depending on how much communication you need between the GPUs (https://discuss.pytorch.org/t/ddp-training-on-rtx-4090-ada-cu118/168366)
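To make the budget math concrete, here's a minimal back-of-the-envelope sketch; the ~$13/hr on-demand rate and ~40% spot discount are rough assumptions (actual Azure pricing varies by region and instance family), chosen to roughly reproduce the ~3000-hour figure above:

```python
# Rough cloud-budget estimate for a 4x A100 80GB instance (all rates are assumptions)
budget_usd = 40_000
on_demand_rate = 13.3   # assumed $/hr for a 4x A100 80GB instance
spot_discount = 0.4     # assumed fraction saved with spot/preemptible capacity

on_demand_hours = budget_usd / on_demand_rate
spot_hours = budget_usd / (on_demand_rate * (1 - spot_discount))

print(f"On-demand: ~{on_demand_hours:.0f} h, spot: ~{spot_hours:.0f} h")
# -> On-demand: ~3008 h, spot: ~5013 h
```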
NoScallion2450 t1_jca7ykx wrote
Reply to comment by xEdwin23x in [D] What do people think about OpenAI not releasing its research but benefiting from others’ research? Should google meta enforce its patents against them? by [deleted]
Not saying Google is better or OpenAI is better. But could they now end up in patent battles, given that there is significant commercial interest at stake? And what does OpenAI not releasing any details mean for AI research going forward?
harharveryfunny t1_jca7x9f wrote
Yes - the Transformer is proof by demonstration that you don't need a language-specific architecture to learn language, and also that you can learn language via prediction feedback, which is very likely how our brain does it too.
Chomsky is still sticking to his innateness position though (with Gary Marcus cheering him on). Perhaps Chomsky will now claim that Broca's area is a Transformer?
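For anyone unsure what "prediction feedback" looks like concretely, here is a minimal, purely illustrative next-token training step in PyTorch; the toy model sizes and random token IDs are placeholders, not anything from a real LLM:

```python
import torch
import torch.nn as nn

# Toy decoder-style language model: embed tokens, run a Transformer stack
# with a causal mask, and predict the next token at every position.
vocab_size, d_model, seq_len, batch = 1000, 64, 32, 8

embed = nn.Embedding(vocab_size, d_model)
layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
backbone = nn.TransformerEncoder(layer, num_layers=2)
lm_head = nn.Linear(d_model, vocab_size)

tokens = torch.randint(0, vocab_size, (batch, seq_len))        # placeholder data
causal_mask = nn.Transformer.generate_square_subsequent_mask(seq_len)

hidden = backbone(embed(tokens), mask=causal_mask)
logits = lm_head(hidden)

# "Prediction feedback": the loss is simply how well position t predicts token t+1.
loss = nn.functional.cross_entropy(
    logits[:, :-1].reshape(-1, vocab_size),
    tokens[:, 1:].reshape(-1),
)
loss.backward()
```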
xEdwin23x t1_jca7kxx wrote
Reply to [D] What do people think about OpenAI not releasing its research but benefiting from others’ research? Should google meta enforce its patents against them? by [deleted]
Google has probably used stuff from OpenAI too (decoder-only GPT-style training, or ideas from CLIP, Diffusion, or DALL-E maybe?). Anyway, it's clear they (and probably every large tech company with a big AI team) are in an arms race at this point. It's definitely not a coincidence that Google and OpenAI / Microsoft released on the same day, and we also heard Baidu is releasing sometime these days. Meta and others will probably follow suit. The publicity (and the market share for these new technologies) is worth too much.
etesian_dusk t1_jca2or6 wrote
This looks like a glorified blogpost. It even has sensationalistic bits that add nothing to the subject:
>there is no possibility that Trump understands prime numbers
Sonicxc t1_jca1rh1 wrote
Reply to [D] Simple Questions Thread by AutoModerator
How can I train a model to detect the severity of damage in an image? Which algorithm would suit my needs?
trajo123 t1_jca1bh4 wrote
CCP trolling and disinformation capabilities just got a massive upgrade.
ClassicJewJokes t1_jc9q98x wrote
Reply to comment by YouAgainShmidhoobuh in [D] On research directions being "out of date" by redlow0992
Doesn't matter to most reviewers. There's little care for accessibility either; remember the flak you'd get for not using MuJoCo in an RL paper back when it wasn't open source.
SnooHesitations8849 t1_jc9p18n wrote
Reply to comment by PK_thundr in [D] Choosing Cloud vs local hardware for training LLMs. What's best for a small research group? by PK_thundr
Ah, I see. Then you don't actually have many choices, lol.
sanderbaduk t1_jc9o4hm wrote
Reply to [D] Choosing Cloud vs local hardware for training LLMs. What's best for a small research group? by PK_thundr
Training for what? Classification, embedding, generation?
Magnesus t1_jc9ljj5 wrote
A lot of Chomsky's ideas have been refuted, and his defence of Russia is simply awful. The man is more often wrong than right.
Select_Beautiful8 t1_jc9lckr wrote
Reply to comment by KerfuffleV2 in [P] ChatRWKV v2 (can run RWKV 14B with 3G VRAM), RWKV pip package, and finetuning to ctx16K by bo_peng
Just got time to try it, but it doesn't load, nor does it give an error message :( Thanks anyway for your help!
Disastrous_Elk_6375 t1_jc9ks2y wrote
Reply to comment by PK_thundr in [D] Choosing Cloud vs local hardware for training LLMs. What's best for a small research group? by PK_thundr
There's a rent vs. buy section in that article. It basically comes down to how much you'll use the box vs. how often/fast you need to test things out. They go through energy costs and all that in the article. Just plug in your figures and see what the output is.
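As a rough illustration of that rent-vs-buy math (every number below is a made-up placeholder; substitute your own figures from the article):

```python
# Toy rent-vs-buy break-even estimate; all figures are assumptions.
local_box_cost = 15_000      # upfront cost of a local multi-GPU server, USD (assumed)
power_draw_kw = 1.5          # average draw under load, kW (assumed)
electricity_rate = 0.30      # USD per kWh (assumed)
cloud_rate = 6.0             # USD per hour for a comparable cloud instance (assumed)

local_cost_per_hour = power_draw_kw * electricity_rate
breakeven_hours = local_box_cost / (cloud_rate - local_cost_per_hour)
print(f"Break-even after ~{breakeven_hours:.0f} hours of use")
# -> Break-even after ~2703 hours of use
```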
bearific t1_jc9jo5y wrote
Reply to comment by YouAgainShmidhoobuh in [D] On research directions being "out of date" by redlow0992
Yet when my sister submitted a paper before ChatGPT was released, she got complaints, literally days after it came out, that she had not evaluated on ChatGPT.
PK_thundr OP t1_jc9imve wrote
Reply to comment by SnooHesitations8849 in [D] Choosing Cloud vs local hardware for training LLMs. What's best for a small research group? by PK_thundr
Their pricing is definitely much better than AWS and Google Cloud; unfortunately, they don't seem to be HIPAA compliant, or at least that isn't advertised.
2lazy2buy t1_jcaary6 wrote
Reply to [D] Simple Questions Thread by AutoModerator
How does one achieve long context lengths for LLMs? ChatGPT has a 32k context length? Is the transformer decoder "just" that big?