The 23rd ACM Symposium on Document Engineering

August 22, 2023 to August 25, 2023
Limerick, Ireland


This is the main structure of the DocEng'23 programme:

DocEng’23 Programme

Tuesday 22nd of August. Tutorials Day

8:30 Registration & Networking

9:00 Tutorial 1: Looking Beneath the Surface: The Science and Applications of Eye-Gaze Tracking for Assessing Visual Attention. Stefania Cristina
10:30 Coffee Break
11:00 Tutorial 1 (cont.)

12:30 Lunch break

14:00 Tutorial 2: Reviewer #2 must be stopped! Or the art of providing good reviews. Alexandra Bonnici, Steven Simske
15:30 Coffee Break
16:00 Tutorial 2 (cont.)
17:30 Tutorials end

18:00 Welcome Reception - Stables Pub (On campus)

Wednesday 23rd of August. Day 1

8:30 Registration & Networking
9:00 Welcome message

9:30 Keynote by Joanne O'Doherty: Making a Difference through Data Visualisation. Chair: Patrick Healy.
10:30 Coffee Break

11:00 Session 1 - Document Modelling, Management & Representation. Chair: Ethan Munson
11:00 Dynamic Topic Modelling with Tensor Decomposition as a Tool to Explore the Legal Precedent Relevance over Time. Fernando Alberto Correia dos Santos Junior, Jose Luiz Nunes, Paulo Henrique Alves and Helio Lopes
11:30 Static Pruning for Multi-representation Dense Retrieval. Antonio Acquavia, Nicola Tonellotto and Craig Macdonald
12:00 Genetic Generative Information Retrieval. Hrishikesh Kulkarni, Zachary Young, Nazli Goharian, Ophir Frieder and Sean MacAvaney
12:15 Improving Zero-Shot Text Matching for Financial Auditing with Large Language Models. Lars Hillebrand, Armin Berger, Tobias Deußer, Tim Dilmaghani, Mohamed Khaled, Bernd Kliem, Rüdiger Loitz, Maren Pielka, David Leonhard, Christian Bauckhage and Rafet Sifa

12:30 Lunch break

14:00 Session 2 - Document Recognition, Summarisation and Inference. Chair: Peter King
14:00 WEATHERGOV+: A Table Recognition and Summarization Dataset to Bridge the Gap Between Document Image Analysis and Natural Language Generation. Amanda Dash, Melissa Cote and Alexandra Branzan Albu
14:30 Automatically Inferring the Document Class of a Scientific Article. Antoine Gauquier and Pierre Senellart
15:00 Character Relationship Mapping in Major Fictional Works Using Text Analysis Methods. Sam Wolyn and Steve Simske
15:15 Label Dependency Learning for Multilabel Text Classification. Haytame Fallah, Emmanuel Bruno, Elisabeth Murisasco and Patrice Bellot
15:30 Coffee Break

16:00 Birds of a Feather (BoF) Intro and Invitation for Ideas. Chair: Charles Nicholas.
16:15 ACM SIGWEB Town Hall meeting with Eelco Herder.
17:00 Day end

19:30 Dinner in Castletroy Park Hotel.

Thursday 24th of August. Day 2

9:00 Welcome & Networking
9:30 Keynote by Gary Moloney: The evolution and growth of engineering documents for consumer engagement. Chair: Steve Simske.
10:30 Coffee Break

11:00 Session 3 - Visual Document Analysis. Chair: Melissa Cote.
11:00 Addressing the gap between current language models and key-term based clustering. Eric M. Cabral, Sima Rezaeipourfarsangi, Maria Cristina F. de Oliveira, Evangelos Milios and Rosane Minghim
11:30 Using YOLO Network for Automatic Processing of Finite Automata Images with Application to Bit-Strings Recognition. Daniela Costa and Carlos Mello
12:00 Layout Analysis of Historic Architectural Program Documents. Amir Hossein Oliaee and Andrew Tripp
12:15 OntG-Bart: Ontology-Infused Clinical Abstractive Summarization. Sajad Sotudeh and Nazli Goharian

12:30 Lunch break. Steering Committee meeting.

13:30 Birds of a Feather
14:30 Poster lightning talks. Chair: Mihai Bilauca.

15:10 Poster session with coffee & interactions on poster papers.

16:30 Trip to Cliffs of Moher (The coach will leave at 16:30 sharp)

Friday 25th of August. Final day.

8:30 Welcome & Networking

9:00 Session 4 - Security, Applications & User Experiences. Chair: Sima Rezaeipourfarsangi
09:00 Privacy Lost and Found: An Investigation at Scale of Web Privacy Policy Availability. Mukund Srinath, C. Lee Giles, Shomir Wilson, Soundarya Nurani Sundareswara and Pranav Venkit
09:30 A PDF Malware Detection Method Using Extremely Small Training Sample Size. Ran Liu, Cynthia Matuszek and Charles Nicholas
09:45 Deep-learning for dysgraphia detection in children handwritings. Andrea Gemelli, Simone Marinai, Emanuele Vivoli and Tamara Zappaterra
10:00 A document format for sewing patterns. Charlotte Curtis
10:15 BoF Recap
10:45 Coffee Break

11:15 Session 5 - Document Content Analysis. Chair: Charlotte Curtis.
11:15 Synchronous Recognition of Musical Images using Coupled N-Gram Models. Manuel Villarreal and Joan Andreu Sánchez
11:45 Technology-Assisted Review for Spreadsheets and Noisy Text. Tom O'Halloran, Bronagh McManus, Andrew Harbison, Maura Grossman and Gordon Cormack
12:00 Multi-task CTC for Joint Handwriting Recognition and Character Bounding Box Prediction. Curtis Wigington

12:30 DocEng 2024. Curtis Wigington
12:45 Challenges Report. Rafael Dueire Lins, Steve Simske
13:00 DocEng Book Series. Steve Simske
13:15 Announcement of Best Paper Awards. Steve Bagley
13:30 Closing remarks. Steve Simske

13:45 Lunch break, end of symposium.

Tutorials Overview

We are delighted to announce two very interesting and exciting tutorials that will be presented on the 22nd of August.

Looking Beneath the Surface: The Science and Applications of Eye-Gaze Tracking for Assessing Visual Attention

Duration: 3 hours (half-day)
Speaker: Dr Stefania Cristina
Rationale: The purpose of visual media is to convey information, ideas, concepts and emotions. For this reason, the effectiveness of visual media can be assessed by how well it captures the attention of its audience and keeps that audience engaged.

Eye movement patterns have long been recognised as providing valuable insights into the cognitive processes that underlie attention, learning and memory, and as such they may shed light on how viewers engage with visual media. The process of measuring and analysing the movements of a person’s eyes is called eye-gaze tracking, which is a powerful tool with a broad range of applications, not only in studying how people interact with visual content, but also in domains such as healthcare, driving, gaming, and many others.

Thanks to advancements in technology, modern eye-gaze trackers have evolved into much less intrusive and more comfortable devices than their scary predecessors. This has made eye-gaze trackers, whether screen-based, head-mounted, or embedded within VR headsets, more accessible and easier to use. In view of the increasing popularity of eye-gaze tracking for studying attention and engagement, this tutorial aims to explore the technology from different angles, including its development over the years, its technical workings, the metrics that may be used to quantify visual attention, and several application domains.

Download full tutorial 1 briefing

Reviewer #2 must be stopped! Or the art of providing good reviews.

Duration: 3 hours (half-day)
Speakers: Dr Alexandra Bonnici and Prof. Steven Simske
Rationale: Love it or hate it, the peer review process (whether open, blind, or even double-blind) has become the standard and accepted way of assessing the quality of papers before publication, be it for a conference, journal, or book. Indeed, forming the program committee is an essential part of organising any conference, and a good program committee may well be what differentiates it from peer conferences. However, we have all been the recipients of a less than stellar or helpful review: from the snarky ones to the one-liners, these reviews can be demoralising and can give the peer-review process a bad reputation! This tutorial therefore aims to encourage researchers to become more involved in the peer-review process by joining program committees, and to promote good practices that collectively strengthen the quality of peer review.

Download full tutorial 2 briefing

Social Events

We are happy to announce that on the evening of Wednesday the 23rd of August we will hold the gala dinner, with traditional Irish music, in the Castletroy Park Hotel.

On the afternoon of Thursday the 24th of August, we invite you on a road trip along the amazing Atlantic coast to experience part of the Wild Atlantic Way!

Feel free to bring a guest with you to DocEng'23! Special guest tickets can be purchased on our registration website for the gala dinner on the 23rd of August ($70 for early registration, $75 for late registration) and for the trip on the 24th of August ($40 for early registration, $45 for late registration).

On the 24th of August, you will first step inside Limerick City's most iconic landmark: King John's Castle. The stunning exhibition at King John’s Castle brings to life over 800 years of dramatic local history. The castle itself has a turbulent past dating back to Viking times and has undergone several sieges, battles and triumphs over its long history.

(King John's Castle image courtesy of Discover Limerick DAC)

Our trip will then continue to the Atlantic coast to capture the sunset (weather permitting) over a world-famous landmark: the Cliffs of Moher.

One of Ireland’s favourite visitor experiences, the Cliffs of Moher tower over the rugged west Clare coast.

Walk the safe, paved pathways and view the famous Cliffs on Europe’s western frontier and enjoy the spectacular vistas over the Atlantic Ocean and the Aran Islands.

Their natural beauty has inspired artists, musicians, and poets for generations, as well as absorbing scientists and geologists, drawn by the unique landscape in which they sit.

The Cliffs of Moher host major colonies of nesting sea birds and are one of the country’s most important bird-breeding sites. The area has been designated a Special Protection Area (SPA) for Birds.

The Cliffs of Moher, the most famous cliffs in Ireland, will leave you awestruck, creating memories that will stay with you forever.