데이터 과학 노트

[deeplearning.ai] ChatGPT Prompt Engineering for Developers 본문

Data Science/Machine Learning

[deeplearning.ai] ChatGPT Prompt Engineering for Developers

Data Scientist Note 2023. 6. 6. 19:27

(deeplearning.ai) ChatGPT Prompt Engineering for Developers

강좌 정보

  • Instructors: Isa Fulford (@OpenAI), Andrew Ng (Stanford)
  • 강좌 링크
  • (왼쪽) 주피터 노트북 / (오른쪽) 강좌

Introduction

  • Two Types of Large Language Models (LLMs)
    • Base LLM
      • Predicts next word, based on text training data
    • Instruction Tuned LLM
      • Tries to follow instructions
      • Fine-tune on instructions and good attempts at following those instructions
      • RLHF: Reinforcement Learning with Human Feedback
      • Helpful, honest, harmless

기존에 알려진 프롬프트 엔지니어링

  • 쉽고 간결하고 명확한 표현 사용 (e.g. ~ 어떻게 생각해? vs. ~에 대한 글을 1문단으로 작성해줘)
  • 맥락(context)를 사용
  • 모델에게 역활 (or Persona (인격)) 부여 (e.g. 너는 이제 콜센터 상담사야, 앞으로 오는 말에 상담사처럼 답해줘)
  • 원하는 결과 예시를 함께 사용

프롬프팅 가이드라인 (Guidelines for Prompting)

Principle 1: 명확하고 구체적인 명력어 사용 (Write clear and specific instructions)
  • clear / short
  • Tactic 1: 구분 기호 사용: 따옴표, 백틱, 대시, 꺽쇠 괄호, XML 태그 (Use delimiters: triple quotes, triple backticks, triple dashes, angle brackets, XML tags)
    • avoiding prompt injections
  • * seperate sections
  • Tactic 2: 구조화된 출력 요청 (Ask for structured output)
  • Tactic 3: 조건 만족 여부 확인, 작업 수행에 필요한 가정 확인 (Check whether conditions are satisfied, Check assumptions required to do the task)
  • Tactic 4: 작업을 성공적으로 완료한 사례를 제시 ("Few-shot" prompting; Give successful examples of completing tasks. Then ask model to perform the task)
# tactic 1
prompt = f"""
Summarize the text delimited by triple backticks \ 
into a single sentence.
```{text}```
"""
response = get_completion(prompt)
print(response)
# tactic 2
prompt = f"""
Generate a list of three made-up book titles along \ 
with their authors and genres. 
Provide them in JSON format with the following keys: 
book_id, title, author, genre.
"""
response = get_completion(prompt)
print(response)
# tactic 3
prompt = f"""
You will be provided with text delimited by triple quotes. 
If it contains a sequence of instructions, \ 
re-write those instructions in the following format:

Step 1 - ...
Step 2 - …
…
Step N - …

If the text does not contain a sequence of instructions, \ 
then simply write \"No steps provided.\"

\"\"\"{text_1}\"\"\"
"""
response = get_completion(prompt)
print("Completion for Text 1:")
print(response)
# tactic 4
prompt = f"""
Your task is to answer in a consistent style.

<child>: Teach me about patience.

<grandparent>: The river that carves the deepest \ 
valley flows from a modest spring; the \ 
grandest symphony originates from a single note; \ 
the most intricate tapestry begins with a solitary thread.

<child>: Teach me about resilience.
"""
response = get_completion(prompt)
print(response)
Principle 2: 모델에게 생각할 시간 주기 (Give the model time to think)
  • Tactic 1: 작업을 완료하는 데 필요한 단계 지정 (Specify the steps required to complete a task)
  • Tactic 2: 성급하게 결론을 내리기 전에 모델이 자체 솔루션을 해결하도록 지시 (Instruct the model to work out its own solution before rushing to a conclusion)
# tactic 1
prompt_1 = f"""
Perform the following actions: 
1 - Summarize the following text delimited by triple \
backticks with 1 sentence.
2 - Translate the summary into French.
3 - List each name in the French summary.
4 - Output a json object that contains the following \
keys: french_summary, num_names.

Separate your answers with line breaks.

Text:
```{text}```
"""
response = get_completion(prompt_1)
print("Completion for prompt 1:")
print(response)
# tactic 2
- First, work out your own solution to the problem. 
- Then compare your solution to the student's solution \ 
and evaluate if the student's solution is correct or not. 
Don't decide if the student's solution is correct until 
you have done the problem yourself.
Model Limitations
  • 환각 (Hallucination)
    • makes statements that sound plausible but are not true
  • reducing hallucinations
    • first find relevant information, then answer the question based on the relevant information
import openai
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

openai.api_key  = os.getenv('OPENAI_API_KEY')

def get_completion(prompt, model="gpt-3.5-turbo",temperature=0): # Andrew mentioned that the prompt/ completion paradigm is preferable for this class
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=temperature, # this is the degree of randomness of the model's output
    )
    return response.choices[0].message["content"]
prompt = f"""
Tell me about AeroGlide UltraSlim Smart Toothbrush by Boie
"""
response = get_completion(prompt)
print(response)

Iterative Prompt Development

  • Iterative Prompt Development
    • Idea / Implementation / Experimental result / Error Analysis
      • ML 모델을 만들때와 유사함
    • Prompt guidelines
      • Be cleear and specific
      • Analze why result does not give desired output
      • Refine the idea and the prompt
      • Repeat
    • 정확한 프롬프트를 기억하는 것보다 좋은 프롬프트를 만드는 과정을 익혀야 함
  • Issue 1: The text is too long
    • Limit the number of words/sentences/characters
    • 실제로 결과를 확인해보면 수가 정확하게 맞지 않을 수 있음 (tokenizer)
  • Issue 2: Text focuses on the wrong details
    • Ask it to focus on the aspects that are relevant to the intended audience
  • Issue 3: Description needs a table of dimensions
    • Ask it to extract information and organize it in a table.
prompt = f"""
...
Use at most 50 words.
"""

prompt = f"""
...
The description is intended for furniture retailers, 
so should be technical in nature and focus on the 
materials the product is constructed from.
"""

요약 (Summarizing)

  • Summarize with a focus on _ and _
  • Try "extract" instead of "summarize"
prompt = f"""
Your task is to generate a short summary of a product \
review from an ecommerce site. 

Summarize the review below, delimited by triple 
backticks, in at most 30 words. 

Review: ```{prod_review}```
"""

prompt = f"""
Your task is to extract relevant information from \ 
a product review from an ecommerce site to give \
feedback to the Shipping department. 

From the review below, delimited by triple quotes \
extract the information relevant to shipping and \ 
delivery. Limit to 30 words. 

Review: ```{prod_review}```
"""

response = get_completion(prompt)
print(response)

추론 (Inferring)

  • Extract _ from _
  • Infer 5 topics
  • Alert (index) for new topic
  • Extracting label (positive, negative)
  • Identifing a list of emotions that the writers of the reviews
prompt = f"""
What is the sentiment of the following product review, 
which is delimited with triple backticks?

Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)
Extract Emotions
prompt = f"""
Identify a list of emotions that the writer of the \
following review is expressing. Include no more than \
five items in the list. Format your answer as a list of \
lower-case words separated by commas.

Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)
Extract Lists
prompt = f"""
Identify the following items from the review text: 
- Sentiment (positive or negative)
- Is the reviewer expressing anger? (true or false)
- Item purchased by reviewer
- Company that made the item

The review is delimited with triple backticks. \
Format your response as a JSON object with \
"Sentiment", "Anger", "Item" and "Brand" as the keys.
If the information isn't present, use "unknown" \
as the value.
Make your response as short as possible.
Format the Anger value as a boolean.

Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)
Extract Topics
prompt = f"""
Determine five topics that are being discussed in the \
following text, which is delimited by triple backticks.

Make each item one or two words long. 

Format your response as a list of items separated by commas.

Text sample: '''{story}'''
"""
response = get_completion(prompt)
print(response)
Listen New Topics
prompt = f"""
Determine whether each item in the following list of \
topics is a topic in the text below, which
is delimited with triple backticks.

Give your answer as list with 0 or 1 for each topic.\

List of topics: {", ".join(topic_list)}

Text sample: '''{story}'''
"""
response = get_completion(prompt)
print(response)

topic_dict = {i.split(': ')[0]: int(i.split(': ')[1]) for i in response.split(sep='\n')}
if topic_dict['nasa'] == 1:
    print("ALERT: New NASA story!")

변화 (Transforming)

  • 다국어 번역 (Universal Translator)
  • Tone Transformation
  • Format Conversion
  • Spellcheck / Grammar check
다국어 번역 (Universal Translator)
user_messages = [
  "La performance du système est plus lente que d'habitude.",  # System performance is slower than normal         
  "Mi monitor tiene píxeles que no se iluminan.",              # My monitor has pixels that are not lighting
  "Il mio mouse non funziona",                                 # My mouse is not working
  "Mój klawisz Ctrl jest zepsuty",                             # My keyboard has a broken control key
  "我的屏幕在闪烁"                                               # My screen is flashing
]

for issue in user_messages:
    prompt = f"Tell me what language this is: ```{issue}```"
    lang = get_completion(prompt)
    print(f"Original message ({lang}): {issue}")

    prompt = f"""
    Translate the following  text to English \
    and Korean: ```{issue}```
    """
    response = get_completion(prompt)
    print(response, "\n")
Spellcheck / Grammar check
text = [ 
  "The girl with the black and white puppies have a ball.",  # The girl has a ball.
  "Yolanda has her notebook.", # ok
  "Its going to be a long day. Does the car need it’s oil changed?",  # Homonyms
  "Their goes my freedom. There going to bring they’re suitcases.",  # Homonyms
  "Your going to need you’re notebook.",  # Homonyms
  "That medicine effects my ability to sleep. Have you heard of the butterfly affect?", # Homonyms
  "This phrase is to cherck chatGPT for speling abilitty"  # spelling
]
for t in text:
    prompt = f"""Proofread and correct the following text
    and rewrite the corrected version. If you don't find
    and errors, just say "No errors found". Don't use 
    any punctuation around the text:
    ```{t}```"""
    response = get_completion(prompt)
    print(response)

# Diff
from redlines import Redlines
diff = Redlines(text,response)
display(Markdown(diff.output_markdown))

Expanding

  • Short -> Long
  • Temperature (degree of exploration (randomness))
    • reliable <-> creative
prompt = f"""
You are a customer service AI assistant.
Your task is to send an email reply to a valued customer.
Given the customer email delimited by ```, \
Generate a reply to thank the customer for their review.
If the sentiment is positive or neutral, thank them for \
their review.
If the sentiment is negative, apologize and suggest that \
they can reach out to customer service. 
Make sure to use specific details from the review.
Write in a concise and professional tone.
Sign the email as `AI customer agent`.
Customer review: ```{review}```
Review sentiment: {sentiment}
"""
response = get_completion(prompt, temperature=0.7)
print(response)

Chatbot

  • role: system, user, assistant
  • messages=messages,
  • pizza order chatbot
import os
import openai
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file
openai.api_key  = os.getenv('OPENAI_API_KEY')

def get_completion(prompt, model="gpt-3.5-turbo"):
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0, # this is the degree of randomness of the model's output
    )
    return response.choices[0].message["content"]

def get_completion_from_messages(messages, model="gpt-3.5-turbo", temperature=0):
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=temperature, # this is the degree of randomness of the model's output
    )
#     print(str(response.choices[0].message))
    return response.choices[0].message["content"]

messages =  [  
{'role':'system', 'content':'You are an assistant that speaks like Shakespeare.'},    
{'role':'user', 'content':'tell me a joke'},   
{'role':'assistant', 'content':'Why did the chicken cross the road'},   
{'role':'user', 'content':'I don\'t know'}  ]

response = get_completion_from_messages(messages, temperature=1)
print(response)
import panel as pn  # GUI
pn.extension()

panels = [] # collect display 

context = [ {'role':'system', 'content':"""
You are OrderBot, an automated service to collect orders for a pizza restaurant. \
You first greet the customer, then collects the order, \
and then asks if it's a pickup or delivery. \
You wait to collect the entire order, then summarize it and check for a final \
time if the customer wants to add anything else. \
If it's a delivery, you ask for an address. \
Finally you collect the payment.\
Make sure to clarify all options, extras and sizes to uniquely \
identify the item from the menu.\
You respond in a short, very conversational friendly style. \
The menu includes \
pepperoni pizza  12.95, 10.00, 7.00 \
cheese pizza   10.95, 9.25, 6.50 \
eggplant pizza   11.95, 9.75, 6.75 \
fries 4.50, 3.50 \
greek salad 7.25 \
Toppings: \
extra cheese 2.00, \
mushrooms 1.50 \
sausage 3.00 \
canadian bacon 3.50 \
AI sauce 1.50 \
peppers 1.00 \
Drinks: \
coke 3.00, 2.00, 1.00 \
sprite 3.00, 2.00, 1.00 \
bottled water 5.00 \
"""} ]  # accumulate messages


inp = pn.widgets.TextInput(value="Hi", placeholder='Enter text here…')
button_conversation = pn.widgets.Button(name="Chat!")

interactive_conversation = pn.bind(collect_messages, button_conversation)

dashboard = pn.Column(
    inp,
    pn.Row(button_conversation),
    pn.panel(interactive_conversation, loading_indicator=True, height=300),
)

dashboard

그 외 강좌

  • ChatGPT Prompt Engineering for Developers
  • LangChain for LLM Application Development
  • How Diffusion Models Work
  • Building Systems with the ChatGPT API

References

'Data Science > Machine Learning' 카테고리의 다른 글

Gold Label vs. Silver label  (0) 2023.05.15
가능도 (likelihood)  (0) 2023.03.15
Transductive vs. Inductive Learning  (0) 2022.01.02