This paper presents work on an artificial anthropomorphic agent with multimodal interaction abilities. It focuses on the development of a markup language, MURML, that bridges the planning and animation tasks in the production of multimodal utterances. This hierarchically structured notation provides a flexible means of describing gestures in a form-based way and of explicitly expressing their relations to accompanying speech.