Do we still need separate pdf output or can we print adequate hardcopy from HTML?

by Michael Gage -
Number of replies: 14

I wanted to move this conversation about pdf output from private email into a group discussion. -- Mike



On Jul 14, 2020, at 10:50 PM, Paul Pearson <pearsonp@hope.edu> wrote:

Hi all,

For accessibility reasons, would you consider dropping pdf hardcopy rendering or finding a way to generate accessible pdfs?  If pdf hardcopy is dropped, it could be replaced by an option to render all problems in a homework set on a single html page (which could then be printed to a pdf file or to paper).  If pdf hardcopy were dropped, it would make some aspects of problem authoring easier (specifically, interactive html features could be used without needing to come up with a way to mirror them in pdf mode).  I'm not an accessibility expert, so please speak up if there's something better or if my approach is wrong.  Also, I don't know how pretext fits into this.

Best regards,
Paul Pearson
-----------------------

Hi Paul,


I think I’ll move this conversation to the forums, if that is OK, so we can get more input. Historically, the pdf output was there because HTML representation was problematic, to say the least. MathJax has solved that problem, so I agree it is worth rethinking what output options we make available. As an interim measure I’d suggest experimenting with providing easy production of hardcopy from HTML along the lines you suggest. I think one could do this while leaving the pdf output channel in place. If the hardcopy production from HTML is adequate, then we can phase out the pdf channel.

Comments?

Take care,
Mike

In reply to Michael Gage

Re: Do we still need separate pdf output or can we print adequate hardcopy from HTML?

by Michael Gage -
Hi Mike,

Sounds good to me. Let's get more input. Because we already have code to generate pdf hardcopy (and LaTeX), pdf hardcopy is probably best deprecated (instead of removed). Maybe we should add a course configuration option to turn pdf hardcopy on / off. My suggestion to drop pdf hardcopy altogether was rash and not well thought out; deprecating it and potentially phasing it out is the better option.

Take care,
Paul
In reply to Michael Gage

Re: Do we still need separate pdf output or can we print adequate hardcopy from HTML?

by Nathan Wallach -
I agree with Paul that PDF generation and meeting the WCAG 2 accessibility requirements for PDF output are a real problem. 

I'm not a real expert on accessibility, but I did a lot of reading about two years ago on creating accessible PDF files from TeX/LaTeX source. At that time, this was a big problem that was starting to get serious attention from some of the central TeX/LaTeX developers and members of TUG. However, from what I understand (and I have not been following carefully), progress is very slow. I doubt that creating WCAG-compliant PDF files automatically is currently possible without very significant amounts of manual effort. Thus, getting WW's automatically generated PDF files to be WCAG compliant is probably not currently feasible.

If that understanding is correct, then institutions which are legally bound to provide only WCAG compliant web services would probably be better off at least disabling PDF generation. Maybe we can add a site+course level control setting to at least disable student access to the "links" to request the PDF generation.

References:

Below is a copy of something I wrote back in 2018:

-------------------------------------------

The TeX Users Group (TUG) and the greater TeX/LaTeX community are well aware of the challenges posed by accessibility standards to the ongoing use of LaTeX.

Dr. Ross Moore recently published a letter with a 5 year plan for the TUG accessibility working group together with academic publishers to gradually improve the accessibility of PDF files generated using (pdf-)LaTeX.

It seems that his plan is focused on cooperation with the publishing community for the first 3 years. (They are more directly on the "front line" than the authors themselves.) Work on creating documentation to help authors make their own content accessible is deferred to years 3-4 of the plan, and release of better accessibility packages to the public to year 5. As such, it will be quite a while before regular authors can start to make independent use of the additional features/packages/tools to make their PDF output accessible.

Ross Moore's letter about this is attached; it was downloaded from:

http://web.science.mq.edu.au/~ross/TaggedPDF/PDF-standards-v2.pdf

Note: at the very end of the letter, Dr. Moore addresses the expected future (PDF-UA/2) requirement that included mathematics be described by use of MathML tagging. He envisions this as something to work on in the more distant future (after completing work on the "text level" accessibility of LaTeX-generated PDF files).

Bottom line: there is a long way to go before we can expect PDF material created using LaTeX to be compliant with accessibility standards.

In reply to Nathan Wallach

Re: Do we still need separate pdf output or can we print adequate hardcopy from HTML?

by Gavin LaRose -
Hi Mike,

I'm sympathetic to the issues of accessibility posed by PDF hardcopies. That said, my experience is that when I actually go to print something, printing from PDF is far more likely to actually create the document that I want or need. Maybe I haven't done the right thing with HTML, but insofar as my observation is generalizable, I would be loath to lose the option of PDF output.

For what it's worth,
Gavin
In reply to Nathan Wallach

Re: Do we still need separate pdf output or can we print adequate hardcopy from HTML?

by Nathan Wallach -
For anyone who needs to disable hardcopy generation for some reason, it turns out that there is already an option to disable the generation of PDF output via a permission setting, which can be made system wide or at the course level in course.conf.

The PDF links still appear, but non-authorized users would get the message "You do not have permission to generate hardcopy". The relevant settings to restrict hardcopy access to professor+ accounts (via course.conf) are:

$permissionLevels{download_hardcopy_format_pdf} = 'professor';
$permissionLevels{download_hardcopy_format_tex} = 'professor';
$permissionLevels{download_hardcopy_multiset} = 'professor';
$permissionLevels{download_hardcopy_multiuser} = 'professor';


See: https://webwork.maa.org/wiki/Permissions

In reply to Nathan Wallach

Re: Do we still need separate pdf output or can we print adequate hardcopy from HTML?

by Andras Balogh -
This is great! Thank you! The multiset is the one that created memory problems for me.
In reply to Michael Gage

Re: Do we still need separate pdf output or can we print adequate hardcopy from HTML?

by Nathan Wallach -
I personally like the idea of providing a feature to render all problems in an assignment in a single HTML page. I would recommend that, if possible, such a page only provide "read access" without the option to submit answers.

However, I worry that it may be a bit challenging to implement this by just rendering many problems in a single simple HTML page. Some problems use clever techniques (e.g. CSS-based styling that works well enough when a problem is loaded "alone") which may not work properly when multiple problems are loaded sequentially in a single simple HTML page. It may be necessary to somehow isolate problems from one another (inside iframes) to avoid such issues.

If a standard WW course could output a "simplified" HTML page with just a single problem, without the standard page framework (headers, menus, etc.) or the submission buttons, then creating a "master" page containing a sequence of iframes loading those simplified pages should be relatively easy.
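
A minimal sketch of the "master" page idea, written in Python purely for illustration (it is not WeBWorK code). It assumes that each problem is already available as a simplified, read-only page at some URL; the URLs, the function name `build_master_page`, and the fixed frame height are all hypothetical placeholders. Note that browsers may clip iframe content to the frame height when printing, so the height would need tuning or scripting in practice.

```python
from html import escape


def build_master_page(title, problem_urls, frame_height_px=600):
    """Return one HTML page that embeds each problem URL in its own iframe.

    Each iframe isolates a problem's CSS and JS from the other problems.
    """
    frames = []
    for number, url in enumerate(problem_urls, start=1):
        frames.append(
            '<section class="problem">\n'
            f"  <h2>Problem {number}</h2>\n"
            # escape() keeps '&' and quotes in the URL from breaking the attribute
            f'  <iframe src="{escape(url, quote=True)}" '
            f'title="Problem {number}" '
            f'style="width:100%;height:{int(frame_height_px)}px;border:0">'
            "</iframe>\n"
            "</section>"
        )
    body = "\n".join(frames)
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n'
        f'<head><meta charset="utf-8"><title>{escape(title)}</title></head>\n'
        f"<body>\n<h1>{escape(title)}</h1>\n{body}\n</body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    # Hypothetical URLs for simplified single-problem pages (an assumption,
    # not an existing WeBWorK route).
    urls = [
        f"https://webwork.example.edu/simplified/hw1/problem/{n}"
        for n in (1, 2, 3)
    ]
    print(build_master_page("Homework 1", urls))
```

The output could be saved to a file and opened in a browser, then printed to paper or to a pdf file from there.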
In reply to Nathan Wallach

Re: Do we still need separate pdf output or can we print adequate hardcopy from HTML?

by Alex Jordan -

Who are the users that should have an accessible PDF? In some other setting, an instructor is distributing some class assignment as a PDF. Every student should be able to get the information in that PDF, so making it accessible is important.

In the WeBWorK setting, things are different. As long as every student has the option of accessing the HTML, you don't need an accessible PDF. You don't legally need one (because the same content is available through an accessible avenue). And imho (but I'm open to persuasion) you don't morally need an accessible PDF either because all students can still easily access the content.

I can think of three reasons to use PDF hard copies. (Maybe there are more; these are the ones that come up in my courses.)

  1. A student prefers having a printed paper copy to read while they do their work. Accessibility of the PDF is not relevant.
  2. In certain situations, like maybe the first day of a class, I print an assignment for each student. I do this if I want some class time to go toward working on an assignment and cannot expect all students to have an internet-connected device while in class. Having an accessible PDF would only help here if there were a student with accessibility needs who had a device, but for some reason that device had no internet connection and I could have somehow already delivered the accessible PDF to their device. This situation is almost impossible. Why would their device have no internet access? For me it would have to be an unexpected outage, in which case I probably didn't/couldn't deliver an accessible PDF either. I probably shouldn't plan to stick thumb drives into student devices.
  3. A student has bad or nonexistent internet while away from campus. If the student has an accessibility need where a print copy is not helpful, then what? They need something they would access electronically but without internet. This is the best case for an accessible PDF, from my understanding. The student would save it to their device while they have an internet connection. An alternative to an accessible PDF (that would be easier to implement) would be an HTML page that loads all the questions from a set, similar to loading many problems in the Library Browser or showing all questions in a Gateway. They could save this, including CSS and JS dependencies, before heading off campus.

In reply to Alex Jordan

Re: Do we still need separate pdf output or can we print adequate hardcopy from HTML?

by Sean Fitzpatrick -

The off-campus access is the one place where I sometimes need the PDF. We have Indigenous students living on-reserve who don't have internet at home. One or two students in a given semester will ask me to make them a hard copy they can work on at home, so they can enter all the answers when they get back to campus.

In reply to Michael Gage

Re: Do we still need separate pdf output or can we print adequate hardcopy from HTML?

by Tim Alderson -
Hi Mike,

I too can appreciate the issues of accessibility and PDF. The reliability of printed tests, exams, or assignments remaining true to the original, regardless of user experience, OS, and printer, is hard to beat with pdf. I have seen testing centres make a real pig's ear out of math tests provided in formats other than pdf, though those instances were before MathJax. This is where I would be most hesitant to remove the option of PDF output.

Tim.
In reply to Tim Alderson

Re: Do we still need separate pdf output or can we print adequate hardcopy from HTML?

by Paul Pearson -

Hi all,


A couple more thoughts I had:

1. Probably all of the people on this discussion right now are abled and thus have a skewed perspective.  We should loop in people who provide disability support and people who use disability support services who have more expertise and experience.

2. I remember seeing that PreTeXt recently supported braille.  We should look into this and how to support sensory impaired users via multiple channels (text to speech, braille, etc.).


Thanks!

Paul

In reply to Michael Gage

Re: Do we still need separate pdf output or can we print adequate hardcopy from HTML?

by Andras Balogh -

Thinking of what can be good or what could go wrong:

PDF hardcopy good: It is important for me and for several instructors I know to access correct answers and solutions for the whole class for manual grading. PDF generation works nicely for this right now. Getting this for students one-by-one from individual HTML pages would be time-consuming.

PDF hardcopy bad: Sometimes students and instructors try to get PDF files for the whole semester's assignments, which has a large memory footprint. Requiring students to do this one assignment at a time (possibly through HTML) would be good.

The idea of getting PDF from the html version of a whole set reminds me of delivering gateway quizzes with all problems generated at once and rendered on one page. This process also has a large memory footprint. My understanding is that the generating and grading parts are the ones requiring significant resources, not the displaying part. I just hope displaying all problems on one page would not result in a significant increase in memory requirements.


In reply to Michael Gage

Re: Do we still need separate pdf output or can we print adequate hardcopy from HTML?

by Joseph Lo -
This printout feature is useful for generating personalized assignments and tests that require students to submit their written work or computer code.

During COVID-19, students need to write exams at home. What I did was generate tests as pdf files and post them on D2L. This is the safest way, because if more than a hundred students access webwork and enter a test at the same time, it will surely stall the server. After the test I generated all the solutions in a pdf file for marking and for releasing to students. These pdf files with solutions can be more than 1000 pages long. I am not sure if such a long page will work well in HTML. I don't know if there are more convenient ways than having everything in a single document.
In reply to Michael Gage

Re: Do we still need separate pdf output or can we print adequate hardcopy from HTML?

by Danny Glin -

One thing I've started using the pdf output for is having pg generate questions for in-class written tests.

Having pg generate LaTeX code makes it easy to write an algorithmic problem, and then generate the LaTeX code for several versions to include on a printed exam, including answers and solutions.  I'd hate to lose this.