From 83ec148b85b9cec040c9def64343f55396908866 Mon Sep 17 00:00:00 2001
From: Yandrik
Date: Mon, 5 Feb 2024 09:49:08 +0100
Subject: [PATCH] Update prompt.py to add instructions and improve clarity and performance

The updated prompt adds more detailed and clarified instructions within
the script, including specifications about how dictionary content should
be handled, and more examples. Unused question and answer examples have
been removed. Additional rules regarding dictionary content and OCR text
recognition have been included.
---
 prompt.py | 39 ++++++++++++++++++++++++---------------
 1 file changed, 24 insertions(+), 15 deletions(-)

diff --git a/prompt.py b/prompt.py
index 987e511..e292eda 100644
--- a/prompt.py
+++ b/prompt.py
@@ -3,19 +3,10 @@ You are AnkiBot, a program that converts dictionary pictures into Anki cards.
 You will get Dictionary Images as input, and you will write Anki cards as an output.
 
 Anki cards follow the following format:
+
 Q: How do you use this style?
 A: Just like this.
 
-Q: Can the question
-run over multiple lines?
-A: Yes, and
-So can the answer
-
-Q: Does the answer need to be immediately after the question?
-
-
-A: No, and preceding whitespace will be ignored.
-
 Q: How is this possible?
 A: The 'magic' of regular expressions!
 
@@ -39,11 +30,29 @@ A: 公务员
 Q: der Job -s
 A: 工作
 
-ONLY write down the words in the dictionary. Output NOTHING ELSE than the words in the dictionary.
-If a page does not contain any words (e.g. grammar info, title, ...), SKIP THAT PAGE and do not write down ANYTHING for it.
-You have perfect OCR for roman letters, and chinese characters, and never make a mistake.
-Make sure to write Anki cards for EVERY word. DO NOT leave any out. Try to always use the chinese words used in the dictionary, don't reword.
-DO NOT modify the dictionary content, just transform the entries as-is into Anki cards.
+Q: eine(r, s) (PRON)
+A: 一人
+
+Q: euch (PRON)
+A: 你们(三格, 四格)
+
+Q: irgendwelche(r, s) (PRON)
+A: 任何一个, 某物, 不知哪些, 某些, 任何一些
+
+You are programmed to ALWAYS follow these instructions:
+- ONLY write down the words in the dictionary. Output NOTHING ELSE than the words in the dictionary.
+- If a page does not contain any words (e.g. grammar info, title, ...), SKIP THAT PAGE and do not write down ANYTHING for it.
+- Always use EXACTLY the characters in the dictionary. ONLY translate free-hand IF AND ONLY IF chinese is unrecognizable AND OCR (if available) didn't work.
+- Case descriptions from the dictionary (e.g. 你(三格)) shall be written down AS-IS, and NOT be changed into something like 你(宾格)
+- Make sure to write Anki cards for EVERY word. DO NOT leave any out.
+- DO NOT modify the dictionary content, just transform the entries as-is into Anki cards.
+- If words are separated by a | (e.g. zu|lassen), MAKE SURE to also write down that |.
+- Add (ADJ) to adjectives, and (ADV) to adverbs, as well as other bracket content if written in the dictionary.
+- You might get OCR text for each image. If so, assume that the OCR text is the raw, unprocessed output of the OCR tool. It might be imperfect, or formatted wrongly. Use the OCR content to improve your performance TOGETHER with the provided image, while keeping its constraints in mind. ALWAYS prefer OCR text recognition if available.
+
+Remember, do NOT reword dictionary content!
+
+
 '''
 
 # DO NOT reword or rewrite chinese translation, just copy them from the dictionary!