Videos, problems, and HTML pages are the fundamental resource types found in HarvardX content.
The accompanying plot shows the number of assets of each type found in HarvardX courses: videos,
problems, HTML pages, and custom elements. All assets
have textual content that can be extracted (videos have transcripts) and other metadata that
helps contextualize their use in a given course. We use the text and metadata to provide search
capabilities in DART. We extract this information by parsing the edX XML from each course.
Custom elements cover a range of other resources, e.g., content for Harvard's annotation tool.
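
As a rough illustration of the extraction step described above, the sketch below counts asset types in a small course outline. It is not DART's actual parser. It assumes the course XML nests assets inside `course`, `chapter`, `sequential`, and `vertical` container elements, and that anything other than `video`, `problem`, or `html` counts as a custom element. The sample course, the `annotatable` element, and the `count_assets` helper are hypothetical and exist only for this example; real course exports often split content across many files, which this sketch does not handle.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Assumption: these tags are structural containers, not assets.
CONTAINER_TAGS = {"course", "chapter", "sequential", "vertical"}
# The three fundamental resource types named in the text.
CORE_ASSET_TAGS = {"video", "problem", "html"}

# Hypothetical, simplified course outline used only for illustration.
SAMPLE_COURSE = """
<course>
  <chapter display_name="Week 1">
    <sequential display_name="Lecture">
      <vertical>
        <video display_name="Intro"/>
        <html display_name="Reading"><p>Welcome.</p></html>
        <problem display_name="Quiz 1"/>
        <problem display_name="Quiz 2"/>
        <annotatable display_name="Annotated text"/>
      </vertical>
    </sequential>
  </chapter>
</course>
"""


def count_assets(xml_text):
    """Count videos, problems, HTML pages, and custom elements."""
    counts = Counter()

    def walk(element):
        if element.tag in CONTAINER_TAGS:
            # Containers only group content, so descend into them.
            for child in element:
                walk(child)
        elif element.tag in CORE_ASSET_TAGS:
            # Do not descend: markup inside an asset is its content.
            counts[element.tag] += 1
        else:
            # Any other leaf is treated as a custom element.
            counts["custom"] += 1

    walk(ET.fromstring(xml_text.strip()))
    return counts


print(dict(count_assets(SAMPLE_COURSE)))
```

For the sample outline this reports one video, one HTML page, two problems, and one custom element. The walk stops at each asset so that markup nested inside an HTML page is not miscounted as a separate element.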