Recent comments in /f/MachineLearning

Abradolf--Lincler t1_jc8ynrt wrote

Learning about language transformers and I’m a bit confused.

It seems like the tutorials on transformers always make input sequences (i.e., text files batched into 100-word windows) the same length to help with batching.
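For example, the pattern I keep seeing looks roughly like this (a rough PyTorch sketch with made-up sizes, not from any particular tutorial):

    # Chop one long tokenized text file into fixed 100-token windows.
    import torch
    import torch.nn as nn

    WINDOW, VOCAB, D_MODEL, PAD_ID = 100, 10_000, 256, 0

    tokens = torch.randint(1, VOCAB, (1234,))   # pretend tokenized text file

    chunks = list(tokens.split(WINDOW))
    # Pad the leftover chunk so everything stacks into one fixed-size batch.
    chunks[-1] = nn.functional.pad(chunks[-1], (0, WINDOW - len(chunks[-1])), value=PAD_ID)
    batch = torch.stack(chunks)                 # (num_windows, 100)
    padding_mask = batch.eq(PAD_ID)             # True where padded

    embed = nn.Embedding(VOCAB, D_MODEL, padding_idx=PAD_ID)
    layer = nn.TransformerEncoderLayer(D_MODEL, nhead=8, batch_first=True)
    encoder = nn.TransformerEncoder(layer, num_layers=2)

    # Every batch the model ever sees is exactly 100 tokens wide.
    out = encoder(embed(batch), src_key_padding_mask=padding_mask)
    print(out.shape)                            # (13, 100, 256)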

Doesn’t that mean the model will only work with that exact sequence length? How do you efficiently train a model to work with any sequence length, such as shorter sequences with no padding, or sequences longer than the batched window?

I see attention models advertised as having an infinite context window. Are there any good resources/tutorials that explain how to build a model like this?

1

bhagy7 t1_jc8rdbj wrote

Yes, it is possible to train a small diffusion model conditioned on text captions from scratch on 64x64 images or even smaller. Depending on the complexity of the model and the number of GPUs you are using, it could take anywhere from a few hours to several days. If you are
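To make "from scratch" a bit more concrete, here is a minimal sketch of the kind of training step involved (pure PyTorch; unet and text_encoder are hypothetical placeholders for your own conditional U-Net and caption encoder, not a specific library):

    import torch
    import torch.nn.functional as F

    def training_step(unet, text_encoder, images, captions, num_timesteps=1000):
        # images: (B, 3, 64, 64) scaled to [-1, 1]; captions: a tokenized text batch.
        batch = images.shape[0]

        # Linear beta schedule -> cumulative alphas for the forward (noising) process.
        betas = torch.linspace(1e-4, 0.02, num_timesteps, device=images.device)
        alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)

        # Sample a random timestep per image and add the matching amount of noise.
        t = torch.randint(0, num_timesteps, (batch,), device=images.device)
        noise = torch.randn_like(images)
        a = alphas_cumprod[t].view(batch, 1, 1, 1)
        noisy = a.sqrt() * images + (1.0 - a).sqrt() * noise

        # Condition the denoiser on caption embeddings (e.g. via cross-attention).
        cond = text_encoder(captions)          # hypothetical caption encoder
        pred_noise = unet(noisy, t, cond)      # hypothetical conditional U-Net

        # Standard epsilon-prediction objective.
        return F.mse_loss(pred_noise, noise)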

1

trnka t1_jc8csxm wrote

If you have a significant amount of data, I'd suggest starting with BERT (and including some basic baselines).
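For reference, a minimal sketch of what starting with BERT could look like using Hugging Face transformers (the model name, labels, and example data below are placeholders, not recommendations):

    import torch
    from torch.optim import AdamW
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2)       # e.g. pass / no-pass

    texts = ["Explained hash map trade-offs clearly", "Did not answer the question"]
    labels = torch.tensor([1, 0])

    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    optimizer = AdamW(model.parameters(), lr=2e-5)

    model.train()
    for _ in range(3):                           # tiny demo loop
        out = model(**inputs, labels=labels)     # returns loss and logits
        out.loss.backward()
        optimizer.step()
        optimizer.zero_grad()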

If you only have a small amount of data, you might be able to use GPT models with a fair amount of prompt engineering.
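By prompt engineering I mean something as lightweight as a few-shot template, for example (the labels and example responses below are made up):

    # Hypothetical few-shot prompt for scoring an interview response.
    FEW_SHOT_PROMPT = (
        "Classify the interview response as STRONG or WEAK.\n\n"
        'Response: "Explained the trade-offs between a hash map and a B-tree."\n'
        "Label: STRONG\n\n"
        'Response: "Said they would just google it."\n'
        "Label: WEAK\n\n"
        'Response: "{response}"\n'
        "Label:"
    )

    def build_prompt(response: str) -> str:
        # The filled-in prompt is what you'd send to whatever GPT-style model you use.
        return FEW_SHOT_PROMPT.format(response=response)

    print(build_prompt("Walked through a correct O(n log n) solution unprompted."))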

Also, you'll probably face different challenges depending on whether the candidate types the response themselves or an interviewer summarizes it. If you're working from interviewer notes, you might find simple proxies, like certain interviewers typing more for good candidates.

1

v_krishna t1_jc7wzmx wrote

I don't doubt it. I've only been using it for workflow aids (Copilot-style stuff, generating unit tests to cover error-handling conditions, etc.), and now we're piloting our first generative text products, but with a human very much in the loop: customer data feeds into a prompt, and the output then goes to an editor for a human to proofread and update before we do anything with it. The number of totally fake webinars hosted by totally fake people it has hallucinated is wild (the content and agendas sound great and perfectly sensible, but none of it exists!)

1