Recent comments in /f/MachineLearning
SuperTankMan8964 t1_jcaaoy4 wrote
Reply to comment by trajo123 in [N] Baidu to Unveil Conversational AI ERNIE Bot on March 16 (Live) by kizumada
Pretty sure you need to verify your account with personal info (e.g. national ID card and an ID-bound cellphone number) to use their services, which means they can track you down if you dare to speak against the supreme leader.
[deleted] OP t1_jca9un1 wrote
respeckKnuckles t1_jca9td9 wrote
Reply to comment by YouAgainShmidhoobuh in [D] On research directions being "out of date" by redlow0992
Have you never been peer reviewed before?
NoScallion2450 t1_jca9r94 wrote
Reply to comment by MysteryInc152 in [D] What do people think about OpenAI not releasing its research but benefiting from others’ research? Should google meta enforce its patents against them? by [deleted]
Maybe, but what about this? https://www.cnbc.com/2019/04/18/apple-paid-5-billion-to-6-billion-to-settle-with-qualcomm-ubs.html
xEdwin23x t1_jca9kr0 wrote
Reply to comment by NoScallion2450 in [D] What do people think about OpenAI not releasing its research but benefiting from others’ research? Should google meta enforce its patents against them? by [deleted]
OpenAI is not a small company either. It may be a "startup", but it's clearly backed by Microsoft, and between the two of them there are probably quite a few patents that Google has used in the past too.
MysteryInc152 t1_jca93qy wrote
Reply to [D] What do people think about OpenAI not releasing its research but benefiting from others’ research? Should google meta enforce its patents against them? by [deleted]
I don't think patent battles will go anywhere. DeepMind could simply stop releasing papers (or curtail it significantly) like they've already hinted they might do.
NoScallion2450 t1_jca8qxo wrote
Reply to comment by xEdwin23x in [D] What do people think about OpenAI not releasing its research but benefiting from others’ research? Should google meta enforce its patents against them? by [deleted]
Why do you say so? For Google it's probably peanuts in terms of cost. And there is a clear case for them to make that transformers originated with them.
xEdwin23x t1_jca8d58 wrote
Reply to comment by NoScallion2450 in [D] What do people think about OpenAI not releasing its research but benefiting from others’ research? Should google meta enforce its patents against them? by [deleted]
It's probably not in their interest, as they know they'll both end up worse off if they go down that path.
knobbyknee t1_jca8489 wrote
I found this article to be extremely insightful. The best I have seen at explaining in layman's terms why large language models work and what their limitations are.
ggf31416 t1_jca7zwz wrote
Reply to [D] Choosing Cloud vs local hardware for training LLMs. What's best for a small research group? by PK_thundr
https://fullstackdeeplearning.com/cloud-gpus/
Your best bet to reach 256 GB of VRAM in the cloud would be Azure with 4x 80 GB A100 instances; however, your 40k budget will only buy you about 3000 hours of compute at best on demand, with spot instances stretching that a bit further.
If that's not enough, you will have to figure out how to build a server with RTX 6000 Ada cards (48 GB each). RTX 4090s would be cheaper, but there may be legal issues due to the gaming driver license, you would need multiple servers (or an aggressive power cap) because of power draw, and Nvidia dropped P2P support, which may or may not matter depending on how much communication you need between the GPUs (https://discuss.pytorch.org/t/ddp-training-on-rtx-4090-ada-cu118/168366)
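To make the budget math concrete, here's a minimal back-of-the-envelope sketch; the ~$13/hr on-demand rate and ~40% spot discount are rough assumptions (actual Azure pricing varies by region and instance family), chosen to roughly reproduce the ~3000-hour figure above:

```python
# Rough cloud-budget estimate for a 4x A100 80GB instance (all rates are assumptions)
budget_usd = 40_000
on_demand_rate = 13.3   # assumed $/hr for a 4x A100 80GB instance
spot_discount = 0.4     # assumed fraction saved with spot/preemptible capacity

on_demand_hours = budget_usd / on_demand_rate
spot_hours = budget_usd / (on_demand_rate * (1 - spot_discount))

print(f"On-demand: ~{on_demand_hours:.0f} h, spot: ~{spot_hours:.0f} h")
# -> On-demand: ~3008 h, spot: ~5013 h
```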
NoScallion2450 t1_jca7ykx wrote
Reply to comment by xEdwin23x in [D] What do people think about OpenAI not releasing its research but benefiting from others’ research? Should google meta enforce its patents against them? by [deleted]
Not saying Google is better or OpenAI is better. But could they now end up in patent battles, given that there is significant commercial interest at stake? And what does OpenAI not releasing any details mean for AI research going forward?
harharveryfunny t1_jca7x9f wrote
Yes - the Transformer is proof by demonstration that you don't need a language-specific architecture to learn language, and also that you can learn language via prediction feedback, which is very likely how our brain does it too.
Chomsky is still sticking to his innateness position though (with Gary Marcus cheering him on). Perhaps Chomsky will now claim that Broca's area is a Transformer?
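For anyone unsure what "prediction feedback" looks like concretely, here is a minimal, purely illustrative next-token training step in PyTorch; the toy model sizes and random token IDs are placeholders, not anything from a real LLM:

```python
import torch
import torch.nn as nn

# Toy decoder-style language model: embed tokens, run a Transformer stack
# with a causal mask, and predict the next token at every position.
vocab_size, d_model, seq_len, batch = 1000, 64, 32, 8

embed = nn.Embedding(vocab_size, d_model)
layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
backbone = nn.TransformerEncoder(layer, num_layers=2)
lm_head = nn.Linear(d_model, vocab_size)

tokens = torch.randint(0, vocab_size, (batch, seq_len))        # placeholder data
causal_mask = nn.Transformer.generate_square_subsequent_mask(seq_len)

hidden = backbone(embed(tokens), mask=causal_mask)
logits = lm_head(hidden)

# "Prediction feedback": the loss is simply how well position t predicts token t+1.
loss = nn.functional.cross_entropy(
    logits[:, :-1].reshape(-1, vocab_size),
    tokens[:, 1:].reshape(-1),
)
loss.backward()
```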
xEdwin23x t1_jca7kxx wrote
Reply to [D] What do people think about OpenAI not releasing its research but benefiting from others’ research? Should google meta enforce its patents against them? by [deleted]
Google has probably used stuff from OpenAI too (decoder-only GPT-style training, or ideas from CLIP, Diffusion, or DALL-E maybe?). Anyway, it's clear they (and probably every large tech company with a big AI team) are in an arms race at this point. It's definitely not a coincidence that Google and OpenAI / Microsoft released on the same day, and we also heard Baidu is releasing sometime these days. Meta and others will probably follow suit. The publicity (and the market share for these new technologies) is worth too much.
etesian_dusk t1_jca2or6 wrote
This looks like a glorified blogpost. It even has sensationalistic bits that add nothing to the subject:
>there is no possibility that Trump understands prime numbers
Sonicxc t1_jca1rh1 wrote
Reply to [D] Simple Questions Thread by AutoModerator
How can I train a model to detect the severity of damage in an image? Which algorithm would suit my needs?
trajo123 t1_jca1bh4 wrote
CCP trolling and disinformation capabilities just got a massive upgrade.
ClassicJewJokes t1_jc9q98x wrote
Reply to comment by YouAgainShmidhoobuh in [D] On research directions being "out of date" by redlow0992
Doesn't matter to most reviewers. There's little care for accessibility either; remember the flak you'd get for not using MuJoCo in an RL paper back when it wasn't open source.
SnooHesitations8849 t1_jc9p18n wrote
Reply to comment by PK_thundr in [D] Choosing Cloud vs local hardware for training LLMs. What's best for a small research group? by PK_thundr
Ah, I see. Then you don't actually have many choices, lol.
sanderbaduk t1_jc9o4hm wrote
Reply to [D] Choosing Cloud vs local hardware for training LLMs. What's best for a small research group? by PK_thundr
Training for what? Classification, embedding, generation?
Magnesus t1_jc9ljj5 wrote
A lot of Chomsky's ideas have been refuted, and his defence of Russia is simply awful. The man is more often wrong than right.
Select_Beautiful8 t1_jc9lckr wrote
Reply to comment by KerfuffleV2 in [P] ChatRWKV v2 (can run RWKV 14B with 3G VRAM), RWKV pip package, and finetuning to ctx16K by bo_peng
Just got time to try it, but it doesn't load, nor does it give an error message :( Thanks anyway for your help!
Disastrous_Elk_6375 t1_jc9ks2y wrote
Reply to comment by PK_thundr in [D] Choosing Cloud vs local hardware for training LLMs. What's best for a small research group? by PK_thundr
There's a rent vs. buy section in that article. It basically comes down to how much you'll use the box vs. how often/fast you need to test things out. They go through energy costs and all that in the article. Just plug in your figures and see what the output is.
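As a rough illustration of that rent-vs-buy math (every number below is a made-up placeholder; substitute your own figures from the article):

```python
# Toy rent-vs-buy break-even estimate; all figures are assumptions.
local_box_cost = 15_000      # upfront cost of a local multi-GPU server, USD (assumed)
power_draw_kw = 1.5          # average draw under load, kW (assumed)
electricity_rate = 0.30      # USD per kWh (assumed)
cloud_rate = 6.0             # USD per hour for a comparable cloud instance (assumed)

local_cost_per_hour = power_draw_kw * electricity_rate
breakeven_hours = local_box_cost / (cloud_rate - local_cost_per_hour)
print(f"Break-even after ~{breakeven_hours:.0f} hours of use")
# -> Break-even after ~2703 hours of use
```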
bearific t1_jc9jo5y wrote
Reply to comment by YouAgainShmidhoobuh in [D] On research directions being "out of date" by redlow0992
Yet when my sister submitted a paper before ChatGPT was released, she got complaints, literally days after it came out, that she had not evaluated on ChatGPT.
PK_thundr OP t1_jc9imve wrote
Reply to comment by SnooHesitations8849 in [D] Choosing Cloud vs local hardware for training LLMs. What's best for a small research group? by PK_thundr
Their pricing is definitely much better than AWS and Google Cloud; unfortunately, they don't seem to be HIPAA compliant, or at least that isn't advertised.
2lazy2buy t1_jcaary6 wrote
Reply to [D] Simple Questions Thread by AutoModerator
How does one achieve long context lengths for LLMs? ChatGPT has a 32k context length? Is the transformer decoder "just" that big?