Recent comments in /f/MachineLearning

dojoteef OP t1_jc6om7a wrote

Thanks for the vote of confidence!

Unfortunately, I recently deleted my Twitter account 🫣. I was barely active there: a handful of tweets in nearly a decade and a half...

That said, I'll probably post my preprint on this sub when it's ready. I also need to recruit some play testers, so I'll probably post on r/discoelysium in the next few weeks to recruit participants (to ensure high-quality evaluations, we need people who have played the game before rather than relying on typical crowdsourcing platforms like MTurk).

1

ushtari_tk421 t1_jc6lh0m wrote

Am I off base in thinking it's silly that a program that just generates text (some of which might be offensive) has to carry a disclaimer that it isn't "harmless"? The worst-case risk seems to be that it says something we would hold against a person, or be offended by, if they said it?

4

sEi_ t1_jc6k44a wrote

IMPORTANT

I see an influx of POSTS WITHOUT REFERENCES.

When you say at the start "recently resolved by researchers" and there's no blue link I can check, I scroll past the post.

And "The paper..." comes up many times, too. What paper?

I simply ignore posts like this. Life is too short to read people's dreams.

EDIT:
When citing something, please put the link in the body of the post so I don't have to search for it further down the thread.

4

sebzim4500 t1_jc6jye3 wrote

The company doesn't always win; sometimes the open-source product is simply better. See Stable Diffusion vs. DALL-E, Linux vs. Windows Server, or Lichess vs. chess.com, etc.

Of course that doesn't mean it will be used more, but that isn't the point.

5

gmork_13 t1_jc6e3ox wrote

The way I was going to implement it with the ChatGPT API was to store the conversation and, as it neared the token limit, have the model itself extract keywords from the conversation so far.

Then you can inject the keywords and search the previous conversation.

But this is still nothing like truly extending the actual memory of the model.
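
A minimal sketch of that idea, assuming the legacy `openai` Python client (`openai.ChatCompletion`) and a crude character-based token count; the threshold, prompt wording, and function names here are illustrative, not the commenter's actual implementation:

```python
# Sketch of the keyword-memory idea described above, assuming the
# legacy openai Python client (openai.ChatCompletion) and a crude
# character-based token estimate; thresholds and names are illustrative.
import openai

MODEL = "gpt-3.5-turbo"
SUMMARIZE_THRESHOLD = 3000  # start compressing well before the context limit

def estimate_tokens(messages):
    # Rough approximation: ~4 characters per token.
    return sum(len(m["content"]) for m in messages) // 4

def extract_keywords(messages):
    # Ask the model itself to compress the conversation into keywords.
    response = openai.ChatCompletion.create(
        model=MODEL,
        messages=messages + [{
            "role": "user",
            "content": "List the key topics, names, and facts from this "
                       "conversation as a short comma-separated list.",
        }],
    )
    return response["choices"][0]["message"]["content"]

def chat(history, user_message):
    history = history + [{"role": "user", "content": user_message}]
    if estimate_tokens(history) > SUMMARIZE_THRESHOLD:
        # Inject the extracted keywords in place of the older turns.
        keywords = extract_keywords(history[:-1])
        history = [
            {"role": "system",
             "content": f"Keywords from the earlier conversation: {keywords}"},
            history[-1],
        ]
    response = openai.ChatCompletion.create(model=MODEL, messages=history)
    reply = response["choices"][0]["message"]["content"]
    return history + [{"role": "assistant", "content": reply}], reply
```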

1

zackline t1_jc69d50 wrote

I'm not sure about this, but I've heard that it's currently not possible to use CUDA while running a game, supposedly because the GPU needs to enter a different mode or something like that.

If that is indeed the case, it might even be a hardware limitation that prevents this use case on current GPUs.

2

folk_glaciologist t1_jc68a4q wrote

You can use searches to augment the responses. You can write a Python script to do this yourself via the API, making use of the fact that you can write prompts that ask ChatGPT questions about other prompts. For example, this is a question that will cause ChatGPT to hallucinate:

> Who are some famous people from Palmerston North?

But you can prepend some text to the prompt like this:

> I want you to give me a topic I could search Wikipedia for to answer the question below. Just output the name of the topic by itself. If the text that follows is not a request for information or is asking to generate something, it is very important to output "not applicable". The question is: <your original prompt>

If it outputs "not applicable", or searching Wikipedia with the returned topic returns nothing, then just reprocess the original prompt raw. Otherwise, download the Wikipedia article (or its first few paragraphs), prepend it to the original prompt, and ask again. Etc.

In general I think that using LLMs as giant databases is the wrong approach: even if we can stop them hallucinating, they will always be out of date because of the time lag to retrain them. Instead, we should use their NLP capabilities to turn user questions into "machine-readable" (whatever that means nowadays) queries that run behind the scenes and are then fed back into the LLM, like Bing Chat doing web searches.
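
For reference, here is a rough sketch of that loop, assuming the legacy `openai` Python client and Wikipedia's public REST summary endpoint; the model choice, function names, and fallback logic are illustrative, not anything the commenter specified:

```python
# Rough sketch of the search-augmented prompting described above,
# assuming the legacy openai Python client and Wikipedia's public REST
# summary endpoint; function names and fallbacks are illustrative.
import openai
import requests

ROUTER_PROMPT = (
    "I want you to give me a topic I could search Wikipedia for to answer "
    "the question below. Just output the name of the topic by itself. If the "
    "text that follows is not a request for information or is asking to "
    'generate something, it is very important to output "not applicable". '
    "The question is: "
)

def ask(prompt):
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return response["choices"][0]["message"]["content"].strip()

def wikipedia_summary(topic):
    # First few paragraphs of the article, or None if nothing is found.
    url = ("https://en.wikipedia.org/api/rest_v1/page/summary/"
           + topic.replace(" ", "_"))
    resp = requests.get(url)
    return resp.json().get("extract") if resp.ok else None

def answer(question):
    topic = ask(ROUTER_PROMPT + question)
    if "not applicable" in topic.lower():   # simplistic match
        return ask(question)                # reprocess the raw prompt
    context = wikipedia_summary(topic)
    if not context:
        return ask(question)                # nothing found: fall back
    return ask(f"Using this background:\n{context}\n\n"
               f"Answer this question: {question}")

print(answer("Who are some famous people from Palmerston North?"))
```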

1

femboyxx98 t1_jc601pw wrote

The actual implementation of most models is quite simple, and he often reuses the same building blocks. The challenge is obtaining the dataset and actually training the models (and the hyperparameter search), and he doesn't provide any trained weights himself, so it's hard to know whether his implementations even work out of the box.

9

generatorman_ai t1_jc5w4m9 wrote

The general problem of generative NPCs seems like a subset of robotics rather than pure language models, so that still seems some way off (but Google made some progress with PaLM-E).

LLMs and Disco Elysium sounds like the coolest paper ever! I would love to follow you on Twitter to get notified when you release the preprint.

4