The Atlas Capture Labeling Workshop focuses on the principles of effective labeling for hand-object interactions. It provides guidelines for identifying which actions require labels and emphasizes the importance of using imperative voice in labeling. This workshop is designed for professionals involved in data annotation and machine learning, aiming to enhance accuracy in labeling tasks. Key topics include segment boundaries, timestamp editing, and labeling standards, making it a valuable resource for those in the AI and data science fields.

Key Points

  • Explains the importance of goal-oriented hand-object actions in labeling.
  • Covers critical boundaries for segmenting actions in video data.
  • Details timestamp editing techniques for accurate labeling.
  • Outlines labeling standards to ensure clarity and precision.
Joseph Akujieze
38 pages
Language: English
Type: Guide
Labeling Workshop
Foundations

What Requires a Label

Do NOT label (no hand-object interaction, no label):

  • Walking or navigation
  • Looking / inspecting / checking
  • Idle gestures
  • Camera or face touches
  • Irrelevant side actions

Label requires:

  • Goal-oriented hand-object actions relevant to the task
  • Left hand and right hand usage during hand-object interactions
  • Object transfers between hands must be labelled

[GIF]
Label: unfold paper with left hand, pick up brush with right hand

[GIF]
Label: pass baking tray in right hand to left hand

FAQs

What actions require labeling in the Atlas Capture process?
In the Atlas Capture process, goal-oriented hand-object actions relevant to the task require labeling. This includes actions where the left and right hands are used during hand-object interactions, as well as object transfers between hands. For example, actions like unfolding paper with one hand and picking up a brush with the other must be labeled, while actions such as walking, inspecting, or idle gestures do not require a label.
What are the rules for using imperative voice in labeling?
When writing labels for actions, they should be phrased in the imperative voice as commands. For instance, acceptable labels include 'pick up spoon with right hand' or 'hold cup with left hand, place screw in box with right hand.' Labels that describe actions in the passive voice, such as 'the spoon is picked with right hand,' are not acceptable. This structure helps maintain clarity and directness in the labeling process.
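The imperative-voice rule can be approximated with a rough automated check. The sketch below is only a heuristic, not part of the workshop material: the function name `looks_imperative`, the word lists, and the assumption that actions within a label are separated by commas are all illustrative choices.

```python
# Heuristic sketch: flag labels that are probably not imperative commands.
# The word lists below are assumptions made for illustration only.
PASSIVE_AUXILIARIES = {"is", "are", "was", "were", "be", "been", "being"}
ARTICLES = {"the", "a", "an"}


def looks_imperative(label: str) -> bool:
    """Return True if every comma-separated clause looks like a command."""
    for clause in label.lower().split(","):
        words = clause.split()
        if not words:
            return False  # empty clause, e.g. a trailing comma
        if words[0] in ARTICLES:
            return False  # starts with a noun phrase, not a verb
        if any(word in PASSIVE_AUXILIARIES for word in words):
            return False  # "is picked", "was placed", ...
    return True


for label in (
    "pick up spoon with right hand",
    "hold cup with left hand, place screw in box with right hand",
    "the spoon is picked with right hand",
):
    print(label, "->", looks_imperative(label))
```

A check like this can only catch obvious violations; a human reviewer still has to confirm that each clause starts with a suitable verb.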
How should boundaries be defined in labeling segments?
Boundaries in labeling segments should be defined by the engagement of hands with objects. A segment starts when the hands begin to engage with an object and ends when the hands disengage or when the goal changes. For example, if a person picks up a nail polish bottle with one hand and then places it in a box, each action should be clearly labeled with its respective timestamp to avoid bleeding into neighboring segments.
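As an illustration of the boundary rule, the following sketch turns a list of annotated events into segments. The `Event` type and the event names `engage`, `disengage`, and `goal_change` are hypothetical; the workshop does not define a data format.

```python
from dataclasses import dataclass


@dataclass
class Event:
    time: float  # seconds from the start of the video
    kind: str    # "engage", "disengage", or "goal_change" (hypothetical names)


def segments_from_events(events: list[Event]) -> list[tuple[float, float]]:
    """Build (start, end) segments from hand-object events."""
    segments = []
    start = None  # start time of the currently open segment, if any
    for event in sorted(events, key=lambda e: e.time):
        if event.kind == "engage" and start is None:
            start = event.time                     # hands begin to engage
        elif event.kind == "disengage" and start is not None:
            segments.append((start, event.time))   # hands disengage
            start = None
        elif event.kind == "goal_change" and start is not None:
            segments.append((start, event.time))   # goal changes mid-contact
            start = event.time                     # next segment starts here
    return segments


events = [
    Event(1.0, "engage"),       # pick up nail polish bottle
    Event(3.0, "goal_change"),  # switch to placing the bottle in the box
    Event(5.0, "disengage"),
]
print(segments_from_events(events))
```

Closing a segment exactly where the next one opens is one way to keep actions from bleeding into neighboring segments.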
What is the maximum length for a label in Atlas Capture?
Labels in the Atlas Capture process should stay under 20 words and must accurately describe the actions throughout the entire segment. Additionally, the maximum segment duration is 10 seconds, with one second allocated per label. Micro-actions that are under one second should be rounded up to one second. It is recommended to limit segments to 1-3 atomic actions to maintain clarity and focus.
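These limits are simple enough to check mechanically. The sketch below assumes that atomic actions inside a label are separated by commas, as in the example labels; that separator, the function names, and the message strings are illustrative assumptions rather than part of the Atlas Capture tooling.

```python
MAX_WORDS = 20               # labels should stay under 20 words
MAX_SEGMENT_SECONDS = 10.0   # maximum segment duration
MIN_SEGMENT_SECONDS = 1.0    # micro-actions under 1 s are rounded up to 1 s
MAX_ATOMIC_ACTIONS = 3       # recommended limit of 1-3 atomic actions


def rounded_duration(start: float, end: float) -> float:
    """Segment duration in seconds, with micro-actions rounded up to 1 s."""
    return max(end - start, MIN_SEGMENT_SECONDS)


def check_segment(label: str, start: float, end: float) -> list[str]:
    """Return a list of rule violations; an empty list means no issue found."""
    problems = []
    if len(label.split()) >= MAX_WORDS:
        problems.append("label has 20 or more words")
    if end - start > MAX_SEGMENT_SECONDS:
        problems.append("segment longer than 10 seconds")
    # Assumption: each comma-separated clause is one atomic action.
    actions = [part for part in label.split(",") if part.strip()]
    if not 1 <= len(actions) <= MAX_ATOMIC_ACTIONS:
        problems.append("segment should contain 1-3 atomic actions")
    return problems


print(check_segment("pick up spoon with right hand", 0.0, 2.0))
print(rounded_duration(3.0, 3.5))
```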
What are forbidden words in Atlas Capture labeling?
In Atlas Capture labeling, certain words are considered forbidden and should be avoided to ensure clarity and accuracy. These include terms like 'adjust,' 'manipulate,' and '-ing' verbs such as 'picking up' or 'placing.' Additionally, pronouns like 'it' or 'them' should not be used. Instead, specific actions should be described using precise verbs attached to the objects involved.
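A minimal sketch of a forbidden-word check follows. The source names only a few terms as examples ("terms like"), so the sets here are deliberately incomplete. The '-ing' test looks only at the first word of each comma-separated clause, an assumption made so that nouns such as "baking tray" are not flagged.

```python
import re

# Only the terms named in the guide; the real list may be longer.
FORBIDDEN_VERBS = {"adjust", "manipulate"}
FORBIDDEN_PRONOUNS = {"it", "them"}


def find_forbidden(label: str) -> list[str]:
    """Return forbidden terms found in a label, in order of appearance."""
    hits = []
    for clause in label.lower().split(","):
        words = re.findall(r"[a-z]+", clause)
        if not words:
            continue
        # Assumption: the verb is the first word of each clause.
        if words[0].endswith("ing"):
            hits.append(words[0])
        for word in words:
            if word in FORBIDDEN_VERBS or word in FORBIDDEN_PRONOUNS:
                hits.append(word)
    return hits


print(find_forbidden("picking up spoon with right hand"))
print(find_forbidden("pass baking tray in right hand to left hand"))
print(find_forbidden("adjust cup with left hand, place it in box with right hand"))
```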
What labeling standards should be followed for dense and coarse actions?
When labeling actions, dense labels should focus on exact actions and objects, including micro-actions, and should be detailed and specific. Coarse labels, on the other hand, should be used only when there are too many micro-actions to label densely and should focus on the main goal or objective. Both types of labels must adhere to the maximum segment duration of 10 seconds and should specify the hands used for the actions.
How should timestamps be edited in labeling segments?
Timestamps in labeling segments should be edited to ensure accuracy. If an action is cut off in the video, the timestamp should be extended to include the full action. Conversely, if there is idle time or if the next action is included within a segment, the timestamp should be shortened. This helps maintain clarity and ensures that each label accurately reflects the actions performed.
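The two editing cases can be sketched as a single helper that snaps a segment to the observed extent of its action. The function name `adjust_segment` and the note strings are hypothetical; in practice the annotator determines the action's true start and end by watching the video.

```python
Span = tuple[float, float]  # (start, end) in seconds


def adjust_segment(segment: Span, action: Span) -> tuple[Span, list[str]]:
    """Snap a segment to the action it labels and explain what changed."""
    seg_start, seg_end = segment
    act_start, act_end = action
    notes = []
    if act_start < seg_start or act_end > seg_end:
        # Part of the action fell outside the segment.
        notes.append("extended: action was cut off")
    if act_start > seg_start or act_end < seg_end:
        # The segment contained idle time or part of the next action.
        notes.append("shortened: removed idle time or next action")
    return (act_start, act_end), notes


print(adjust_segment((2.0, 4.0), (2.0, 5.0)))  # action ran past the segment end
print(adjust_segment((2.0, 6.0), (2.0, 4.5)))  # segment included idle time
```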