# Example: converting a Word document to HTML and Markdown

Note

The generated site is available at the URL below:

https://pnavaro.gitpages.huma-num.fr/test-eost

## Convert Word to Markdown

Images embedded in a binary container (docx, epub, or odt), here a Microsoft Word document, can be saved to a directory with the following command. It creates the folder images/media: the media files are extracted from the container and keep their original filenames.

pandoc --extract-media=images -s mydoc.docx -t markdown -o mddoc.md


In Word, image files live in a folder called “media” inside the docx, so a “media” folder is always created. To get a single directory level containing only “media”, extract into the current directory with this command.

pandoc --extract-media=. -s mydoc.docx -t markdown -o mddoc.md
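
When the conversion has to be repeated for several files, the second command above can be wrapped in a small shell function. The following is only a minimal sketch, assuming a POSIX shell and pandoc available on the PATH; the function name `docx_to_md` and the output naming scheme (mydoc.docx becomes mydoc.md) are illustrative choices, not part of pandoc.

```shell
#!/bin/sh
# Hypothetical helper (not part of pandoc): convert one .docx file to Markdown.
# Images are extracted to ./media, as in the second command above.
docx_to_md() {
    input="$1"
    # Stop early with a clear message if the input file does not exist.
    if [ ! -f "$input" ]; then
        echo "docx_to_md: no such file: $input" >&2
        return 1
    fi
    # Derive the output name from the input name: mydoc.docx -> mydoc.md
    output="${input%.docx}.md"
    pandoc --extract-media=. -s "$input" -t markdown -o "$output"
}
```

Calling `docx_to_md mydoc.docx` then writes mydoc.md next to a media directory. Word typically gives embedded images generic names, so converting several documents into the same directory may overwrite images; using one working directory per document avoids this.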


## Convert Word to HTML

To convert a Microsoft Word document to a standalone HTML page, run this command.

pandoc --extract-media=. -s mydoc.docx -t html -c styles.css -o htmldoc.html


To control the appearance of the page, define your own styles.css, for example:

html {
  line-height: 1.7;
  font-family: sans-serif;
  font-size: 20px;
  color: #1a1a1a;
  background-color: #fdfdfd;
}


All images are stored in the media directory, as above. A table of contents is rendered as anchor links. Headers and footers are skipped. Page numbering is not preserved and pages are not separated: the output is one large document. Pandoc's many command-line options let you adjust the result.
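
As one example of the many pandoc options, `--toc` asks pandoc to build its own table of contents from the document headings, and `--toc-depth` limits how many heading levels it includes. The sketch below assumes a POSIX shell and pandoc on the PATH; the function name `docx_to_html`, the default stylesheet name, and the depth of 2 are illustrative choices.

```shell
#!/bin/sh
# Hypothetical helper (not part of pandoc): convert a .docx file to a
# standalone HTML page with a pandoc-generated table of contents.
docx_to_html() {
    input="$1"
    css="${2:-styles.css}"   # stylesheet linked from the page; defaults to styles.css
    # Stop early with a clear message if the input file does not exist.
    if [ ! -f "$input" ]; then
        echo "docx_to_html: no such file: $input" >&2
        return 1
    fi
    # Derive the output name from the input name: mydoc.docx -> mydoc.html
    output="${input%.docx}.html"
    # --toc only has an effect together with -s (standalone output).
    pandoc --extract-media=. -s --toc --toc-depth=2 \
        "$input" -t html -c "$css" -o "$output"
}
```

For instance, `docx_to_html mydoc.docx` writes mydoc.html linked to styles.css, while `docx_to_html mydoc.docx custom.css` links a different stylesheet.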