Recent comments in /f/MachineLearning
Kinexity t1_jc1lwah wrote
Reply to comment by light24bulbs in [P] Discord Chatbot for LLaMA 4-bit quantized that runs 13b in <9 GiB VRAM by Amazing_Painter_7692
That is fast. We are literally talking about a high end laptop CPU from 5 years ago running a 30B LLM.
EcstaticStruggle t1_jc1jts4 wrote
Reply to [D] Simple Questions Thread by AutoModerator
How do you combine hyper parameter optimization with early stopping in cross-validation for LightGBM?
Do you:
1. Use the same validation set for hyperparameter performance estimation and for early stopping (e.g., 80% training, 20% combined early stopping + validation set).
2. Create a separate split within each cross-validation fold for early stopping (e.g., 80% training, 10% early stopping, 10% validation).
3. Set aside a different dataset altogether (like a test set) that is used for early stopping across all cross-validation folds.
In the case of 1) and 2), how would you use early stopping once you identified optimal hyperparameters? Normally, you would re-fit on the entire dataset with the best hyperparameters, but this removes the early stopping data.
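Not an authoritative answer, but option 2 is a common pattern. A minimal sketch of the per-fold split, plus one common way to handle the refit question: record the best iteration found in each fold and refit on the full dataset with `n_estimators` fixed to a summary of those (the fold sizes and iteration counts below are purely illustrative):

```python
from statistics import median

def nested_fold_split(indices, es_frac=0.1, val_frac=0.1):
    """Split one CV fold's indices into train / early-stopping / validation
    subsets (option 2 above: roughly 80/10/10)."""
    n = len(indices)
    n_val = max(1, int(n * val_frac))
    n_es = max(1, int(n * es_frac))
    val = indices[:n_val]                # hyperparameter scoring
    es = indices[n_val:n_val + n_es]     # early-stopping monitor
    train = indices[n_val + n_es:]       # model fitting
    return train, es, val

# Refitting on the full dataset removes the early-stopping set, so one
# workaround is to freeze the tree count at a summary (e.g. the median)
# of each fold's best iteration. Hypothetical per-fold values:
best_iters = [120, 135, 128, 140, 131]
final_n_estimators = int(median(best_iters))
```

With LightGBM specifically, `final_n_estimators` would then be passed as `n_estimators` for the final fit, with no early-stopping callback.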
KerfuffleV2 t1_jc1jtg5 wrote
Reply to [P] ChatRWKV v2 (can run RWKV 14B with 3G VRAM), RWKV pip package, and finetuning to ctx16K by bo_peng
I didn't want to clutter up the issue here: https://github.com/BlinkDL/ChatRWKV/issues/30#issuecomment-1465226569
In case this information is useful for you:
| strategy | time (s) | tokens/s | tokens |
|---|---|---|---|
| cuda fp16 *0+ -> cuda fp16 *10 | 45.44 | 1.12 | 51 |
| cuda fp16 *0+ -> cuda fp16 *5 | 43.73 | 0.94 | 41 |
| cuda fp16 *0+ -> cuda fp16 *1 | 52.7 | 0.83 | 44 |
| cuda fp16 *0+ -> cpu fp32 *1 | 59.06 | 0.81 | 48 |
| cuda fp16i8 *12 -> cuda fp16 *0+ -> cpu fp32 *1 | 65.41 | 0.69 | 45 |
I ran the tests using this frontend: https://github.com/oobabooga/text-generation-webui
It was definitely using rwkv version 0.3.1
env RWKV_JIT_ON=1 python server.py \
--rwkv-cuda-on \
--rwkv-strategy STRATEGY_HERE \
--model RWKV-4-Pile-7B-20230109-ctx4096.pth
For each test, I let it generate a few tokens first to let it warm up, then stopped it and let it generate a decent number. Hardware is a Ryzen 5 1600, 32GB RAM, GeForce GTX 1060 6GB VRAM.
Surprisingly, streaming everything as fp16 was still faster than putting 12 fp16i8 layers in VRAM. A 1060 is a pretty old card, so maybe it has unusual behavior dealing with that format. I'm not sure.
Necessary_Ad_9800 t1_jc1j36g wrote
Reply to comment by MorallyDeplorable in [P] Discord Chatbot for LLaMA 4-bit quantized that runs 13b in <9 GiB VRAM by Amazing_Painter_7692
Where can I see stuff generated from this model?
Guitargamer57 t1_jc1hhjz wrote
I tested it using Japanese and it seems like it misses punctuation for the most part. But, overall, seems to be doing a good job getting the words.
ihexx t1_jc1g3bd wrote
Reply to comment by boyetosekuji in [R] Introducing Ursa from Speechmatics | 25% improvement over Whisper by jplhughes
my guess is model size
I1onza t1_jc1g0u9 wrote
Reply to [D] Simple Questions Thread by AutoModerator
I'm a materials engineering student and an outsider to the ML and AI community. During my studies I take notes on my laptop and don't have a quick, reliable way to copy down simple graphs. With the recent publicity of AI models, I was wondering if someone has already tried to train a model to draw graphs from natural language. DALL-E does it quite horribly (cf. picture). If you haven't heard of such a thing, maybe it's a project you might find interesting to make.
filisterr t1_jc1c8lp wrote
So is this post kind of a hidden advertisement or what?
filisterr t1_jc1c7ss wrote
Reply to comment by Deep-Station-1746 in [R] Introducing Ursa from Speechmatics | 25% improvement over Whisper by jplhughes
Yes, that's probably just cherry-picked marketing.
filisterr t1_jc1c3aa wrote
Reply to comment by Bulky_Highlight_3352 in [R] Introducing Ursa from Speechmatics | 25% improvement over Whisper by jplhughes
Typical: basing your research on open source projects and then making a commercial product on top of other people's work. Great achievement.
firecz t1_jc1bwjs wrote
I would love to see this as a (Windows) GUI, similar to what some Stable Diffusion solutions do (nmkd, grisk...) - the entire code running offline on your PC, not sending anything to Discord or elsewhere.
This would open it to the masses, which in turn would pour more money into research.
Jean-Porte t1_jc1axno wrote
Reply to [N] Man beats machine at Go in human victory over AI : « It shows once again we’ve been far too hasty to ascribe superhuman levels of intelligence to machines. » by fchung
- Machine finds a strategy to beat machine
- Human implements the strategy and beats machine
- Therefore, human beats machine
Deep-Station-1746 t1_jc196cg wrote
>25% improvement over Whisper
>Not open source
>doubt.jpeg
gaybooii t1_jc1950l wrote
Reply to comment by brandonZappy in [R] Introducing Ursa from Speechmatics | 25% improvement over Whisper by jplhughes
Lmaoo
KerfuffleV2 t1_jc18f6a wrote
Reply to comment by Select_Beautiful8 in [P] ChatRWKV v2 (can run RWKV 14B with 3G VRAM), RWKV pip package, and finetuning to ctx16K by bo_peng
Huh, that's weird. You can try reducing the first one from 7 to 6 or maybe even 5:
cuda fp16 *6 -> cuda fp16 *0+ -> cpu fp32 *1
Also, be sure to double-check for typos. :) Any incorrect numbers/punctuation will probably cause problems, especially the "+" in the second part.
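Since typos in the strategy string are such an easy mistake, a quick sanity check can catch them before launch. This is a hypothetical helper: the pattern below is inferred from the example strings in this thread (device, dtype, `*N` with optional `+`, segments joined by `->`), not taken from the official ChatRWKV parser:

```python
import re

# One segment looks like "cuda fp16 *6" or "cuda fp16 *0+" (the trailing
# "+" marks the open-ended streaming segment).
SEGMENT = re.compile(r"^(cuda|cpu)\s+(fp16|fp16i8|fp32)\s+\*\d+\+?$")

def looks_valid(strategy: str) -> bool:
    """Rough shape check for an RWKV strategy string; a False result
    means a likely typo, not that a True string is guaranteed to work."""
    parts = strategy.split("->")
    return all(SEGMENT.match(part.strip()) for part in parts)
```

For example, `looks_valid("cuda fp16 *6 -> cuda fp16 *0+ -> cpu fp32 *1")` passes, while a string with a missing `*` fails.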
CashyJohn t1_jc184r4 wrote
Wav2vec2 is still SOTA. As long as this isn't open source, it's kinda useless lmao
Balance- t1_jc16bi6 wrote
Reply to [P] Introducing confidenceinterval, the long missing python library for computing confidence intervals by jacobgil
Looks awesome!
I would also post at r/Python and/or r/DataScience
AsIAm t1_jc168cw wrote
Reply to comment by Taenk in [P] Discord Chatbot for LLaMA 4-bit quantized that runs 13b in <9 GiB VRAM by Amazing_Painter_7692
It is. But that doesn't mean 1-bit neural nets are impossible. Even Turing himself toyed with such networks – https://www.npl.co.uk/getattachment/about-us/History/Famous-faces/Alan-Turing/80916595-Intelligent-Machinery.pdf?lang=en-GB
[deleted] t1_jc1652y wrote
[removed]
Viacheslav_Varenia t1_jc13mkn wrote
Does it support Ukrainian and Russian?
nucLeaRStarcraft t1_jc1334g wrote
Why is this tagged [R]? This is a commercial project at best. Where's the paper? Where's the code? Can we use it today on our PC like Whisper? This really isn't 'research'.
kuraisle t1_jc11j00 wrote
Reply to comment by Simusid in [D] Simple Questions Thread by AutoModerator
That's really helpful, thank you!
HotRecognition0121 t1_jc10odk wrote
Are there WER numbers for other languages, like on the GitHub page for Whisper? I want to compare the performance in other languages.
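If the vendor doesn't publish per-language numbers, WER is simple enough to compute yourself from reference and hypothesis transcripts. A minimal sketch (plain word-level edit distance; note that published benchmarks like Whisper's also apply text normalization first, which this skips):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # Classic single-row dynamic-programming edit distance over words.
    d = list(range(len(hyp) + 1))
    for i in range(1, len(ref) + 1):
        prev_diag, d[0] = d[0], i
        for j in range(1, len(hyp) + 1):
            temp = d[j]
            d[j] = min(d[j] + 1,                                  # deletion
                       d[j - 1] + 1,                              # insertion
                       prev_diag + (ref[i - 1] != hyp[j - 1]))    # substitution
            prev_diag = temp
    return d[-1] / max(1, len(ref))
```

For example, `wer("a b c", "a x c")` gives 1/3: one substitution out of three reference words.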
futilehabit t1_jc10obb wrote
Reply to comment by cr125rider in [P] Discord Chatbot for LLaMA 4-bit quantized that runs 13b in <9 GiB VRAM by Amazing_Painter_7692
Guess hospice is pretty boring
Raise_Fickle t1_jc1p9x5 wrote
Reply to [P] Discord Chatbot for LLaMA 4-bit quantized that runs 13b in <9 GiB VRAM by Amazing_Painter_7692
Anyone having any luck finetuning LLaMA in a multi-GPU setup?