this post was submitted on 20 Dec 2024
12 points (92.9% liked)
LocalLLaMA
2328 readers
7 users here now
Community to discuss about LLaMA, the large language model created by Meta AI.
This is intended to be a replacement for r/LocalLLaMA on Reddit.
founded 2 years ago
MODERATORS
you are viewing a single comment's thread
view the rest of the comments
view the rest of the comments
Thanks for your answer. I think to be clear, what I'm looking for is a kind of masked fine-tuning. You see, I want to "steer" a particular output instead of providing complete examples, which are costly to create.
The steering would be something like this:
What I would like to do is train the model based on these corrections I give it, where many corrections might be part of the same overall generation. Conceptually I think each correction must have some training value. I don't know much about masking, but what I mean here is that I don't want it to train on a few tens or hundreds of (incomplete) samples but rather thousands of (masked) "steers" that correct the course of the rest of the sample's generated text.