Graduation Year


Document Type




Degree Name

Doctor of Philosophy (Ph.D.)

Degree Granting Department

Computer Science and Engineering

Major Professor

Yu Sun, Ph.D.

Committee Member

Changhyun Kwon, Ph.D.

Committee Member

Xiaoning Qian, Ph.D.

Committee Member

Paul Rosen, Ph.D.

Committee Member

Sudeep Sarkar, Ph.D.

Committee Member

Yicheng Tu, Ph.D.


Bipartite network, Domestic robots, Graph theory, Knowledge representation


In this dissertation, we discuss the development of the functional object-oriented network (abbreviated as FOON), a graphical knowledge representation that supports robotic manipulation and a robot's understanding of its own actions and (potentially) the intentions of humans in the household. Based on the theory of affordance, this representation captures manipulations and their effects on actions by coupling object and motion nodes into fundamental learning units known as functional units. The activities currently represented in FOON are cooking related, but the representation can be extended to other activities that involve manipulating objects and that result in observable changes of state. Typically, a FOON is created by annotating many demonstrations of how tasks are executed from start to finish and merging them into a universal FOON. A robot programmed to use FOON is equipped with the knowledge needed to solve manipulation problems, given a target goal as a node in FOON; we show how a robot can execute this procedure, known as task tree retrieval. Because a robot's physical limitations may prevent it from successfully executing the manipulations in a retrieved task tree for cooking, we demonstrate how human-robot collaboration can be used to overcome these constraints. To complement the universal FOON creation procedure, we also investigate semantic similarity as a means of learning new concepts without annotating new demonstration videos. In addition to the retrieval algorithm, we propose a motion embedding that represents motions by their mechanical characteristics. This representation, known as the motion taxonomy, encodes motions in a binary machine language and thereby addresses the ambiguity inherent in the human-language labels given to motions or manipulations seen in demonstrations.
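To make the ideas of functional units, universal FOON merging, and task tree retrieval more concrete, the following is a minimal illustrative sketch in Python. It is not the implementation developed in the dissertation. The names `ObjectNode`, `FunctionalUnit`, `unit`, `merge`, and `retrieve_task_tree`, the tea-making example, and the simple backward depth-first search are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectNode:
    """Hypothetical object node: an object together with its state."""
    name: str
    state: str


@dataclass(frozen=True)
class FunctionalUnit:
    """Hypothetical functional unit: input objects, one motion, output objects."""
    inputs: frozenset
    motion: str
    outputs: frozenset


def unit(inputs, motion, outputs):
    """Convenience constructor that freezes the input and output node sets."""
    return FunctionalUnit(frozenset(inputs), motion, frozenset(outputs))


def merge(*subgraphs):
    """Combine annotated demonstrations into one network, dropping duplicate units."""
    universal, seen = [], set()
    for subgraph in subgraphs:
        for fu in subgraph:
            if fu not in seen:
                seen.add(fu)
                universal.append(fu)
    return universal


def retrieve_task_tree(foon, goal, available):
    """Backward depth-first search from the goal node (an assumed, simplified strategy).

    Returns the functional units in an executable order, or None if the goal
    cannot be reached from the available objects.
    """
    plan, have = [], set(available)

    def solve(node, visiting):
        if node in have:
            return True
        if node in visiting:  # avoid looping through cyclic dependencies
            return False
        for fu in foon:
            if node not in fu.outputs:
                continue
            checkpoint, snapshot = len(plan), set(have)
            if all(solve(i, visiting | {node}) for i in fu.inputs):
                plan.append(fu)
                have.update(fu.outputs)
                return True
            # This unit failed: undo any partial progress made while trying it.
            del plan[checkpoint:]
            have.clear()
            have.update(snapshot)
        return False

    return plan if solve(goal, frozenset()) else None


# Illustrative data only: two made-up "demonstrations" of preparing tea.
water_cold = ObjectNode("water", "cold")
water_hot = ObjectNode("water", "hot")
tea_bag = ObjectNode("tea bag", "dry")
tea = ObjectNode("tea", "brewed")

demo_1 = [unit([water_cold], "heat", [water_hot])]
demo_2 = [
    unit([water_cold], "heat", [water_hot]),
    unit([water_hot, tea_bag], "steep", [tea]),
]

foon = merge(demo_1, demo_2)
plan = retrieve_task_tree(foon, tea, {water_cold, tea_bag})
print([fu.motion for fu in plan])
```

In this sketch, merging removes the duplicated "heat" unit shared by both demonstrations, and retrieval works backward from the goal node to the objects the robot already has. A step the robot cannot perform could, under the same assumptions, be delegated to a human collaborator rather than treated as a failure.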