

Measurement data in publications is often provided only within figures, while the original data is not available. There are some very useful tools for digitizing such data, such as the web application WebPlotDigitizer, the app Engauge Digitizer, or functionality within the software Origin, but to my knowledge they only support raster images. Since publications are usually available in digital form and the figures in them are often embedded as vector graphics, a more accurate digitization would be desirable.

Are there tools that can directly digitize vector paths from figures (similar to the aforementioned methods)? This question goes beyond precision (see further remarks below) and also addresses an efficient and semi-automated workflow.

The achievable accuracy of course depends on the quality of the figure, or more specifically on (i) how the figure was originally produced, and (ii) how it was processed during publication. Since high-quality plotting tools are often used (e.g. guaranteeing proper resampling) and journals don't always mess up, figures of appropriate quality should be available now and then.

The problem goes beyond precision in terms of reading out values (which could be resolved by rasterizing figures at high resolution and using the aforementioned tools). In complex figures, graphs can (i) cover each other up, (ii) overlap themselves due to scatter and line thickness, and (iii) have varying sampling rates. Hence, rasterizing involves misinterpretation of data. Using a vector graphics editor for preparation before rasterizing (e.g. hiding individual graphs) would help, but is time-consuming and solves only some of the problems.

First, it should of course be checked whether the original numeric data are available, as required by some journals (unfortunately not in many fields). The corresponding author could also be contacted, which often won't lead to success for reasons such as unavailability (of the data or of the author, after some time) or unwillingness.

Thanks to Martin and Massimo Ortolano, whose contributions inspired some of the remarks.

This is an amusing programming/hacking challenge, but my guess is that in 90% of cases the best solution lies in the realm of human affairs, and that is simply to email the authors and ask for the data.

When it works (and I expect it would most of the time), you'll know with certainty that the data you have is exactly what the authors were working with, rather than some approximation recovered by trying to reverse engineer an unknown sequence of human and algorithmic processes that converted the raw data into a figure. (Vector graphics may offer the illusion of lossless encoding, but that assumes no lossy steps were applied by a human or by software anywhere along the way, a dangerous assumption to make in practice.)

On the infrequent occasion when it doesn't work because the authors refuse to share the data, you'll still learn something useful about how trustworthy their data can be assumed to be (i.e., not at all). You can still try to fall back on a technological solution, but in most cases I'd just assume the data is invalid and not worth relying on.

(a) You will have to live with whatever graphic format, conversion, compression, and thus imprecision has happened to the data. Authors are supposed to proofread and confirm that the data is intact, at least on a visual level (or correct it; I have had high-ranking journals mess up data for no obvious reason). You may be able to recover numeric data, though not to the original precision, but the loss is hopefully negligible compared with the imprecision due to statistical sampling. You may not be able to recover "hidden" data such as higher harmonics.

(b) Consider that even the original data, before upload/conversion for the journal, may contain some drawing-related imprecision, e.g. if people use Excel-smoothed lines without showing the actually measured points (which I of course do not recommend for real data, only for symbolic simplification).

(c) In addition to precision, the second layer of complexity is overplotting: the "hidden object" may or may not be invisibly present in a vector graphic. You can use techniques such as opacity (or jittering, e.g. "beeswarm plots" in R for categorical data), but who knows what the authors have done?
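
On the question of reading values directly from vector paths, here is a rough Python sketch of the core step: taking the vertices of one curve and mapping them from page coordinates to data coordinates using two known ticks per axis. It rests on several assumptions that will not hold for every figure: the figure has already been exported to SVG by some tool of your choice, the curve is a path whose `d` attribute uses only absolute `M`/`L` commands, no `transform` attributes apply, both axes are linear, and the tick positions were read off by hand. The function names `parse_polyline` and `make_axis` are made up for this illustration.

```python
import re

# Matches plain and exponent-style numbers as they appear in SVG path data.
NUMBER = r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?"


def parse_polyline(d):
    """Return [(x, y), ...] from path data using only absolute M/L commands.

    Assumption: curves, arcs and relative commands are not handled here.
    """
    if re.search(r"[^ML\s,\d.eE+-]", d):
        raise ValueError("this sketch only handles absolute M/L commands")
    nums = [float(tok) for tok in re.findall(NUMBER, d)]
    if len(nums) % 2:
        raise ValueError("odd number of coordinates in path data")
    return list(zip(nums[0::2], nums[1::2]))


def make_axis(p0, v0, p1, v1):
    """Linear map from page coordinate to data value, fixed by two ticks.

    p0, p1 are page positions of two ticks; v0, v1 are their data values.
    Assumption: the axis is linear (a log axis would need log10 of v).
    """
    if p1 == p0:
        raise ValueError("calibration ticks must be at different positions")
    return lambda p: v0 + (p - p0) * (v1 - v0) / (p1 - p0)


def digitize(d, x_axis, y_axis):
    """Convert every vertex of the path into (x, y) data values."""
    return [(x_axis(px), y_axis(py)) for px, py in parse_polyline(d)]


# Hypothetical figure: x ticks at page x=10 (value 0) and x=110 (value 10);
# y ticks at page y=90 (value 0) and y=10 (value 4). SVG y grows downward,
# which the two-tick calibration handles without special treatment.
x_axis = make_axis(10, 0, 110, 10)
y_axis = make_axis(90, 0, 10, 4)
points = digitize("M 10,90 L 60,40 L 110,10", x_axis, y_axis)
```

This recovers every vertex stored in the path, including ones that are covered by other curves when rendered, but it says nothing about whether those vertices are the authors' raw data or the output of smoothing or resampling.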
