Monday, March 6, 2017

New chart test suite

It's good to see that the LibreOffice project has more and more tests improving the functional quality of the software. In this post, I write about a new test suite added to the project. At Collabora, we are working on new chart functionality, and as preparation we decided to add a test suite which covers the chart layout code better than the tests LO already had. With this we can map the current state of the chart code to test cases and so make sure its functionality remains intact.

Different kinds of automatic tests in LO

There are different forms of automatic tests in the project. The question was which one is the most effective for testing a bigger part of the code (chart layout), not only a small piece of functionality. One form of testing uses CppUnit assertions to compare some properties of the actual test case with the expected values. These tests are used to check a very specific case. For example, such a test can check whether a test document has 2 pages when it is imported into LibreOffice. This kind of test can lead to code duplication when we check the same thing (e.g. page count) on different test documents. That's why this kind of CppUnit test is not effective when we need to test a bigger part of the software systematically, which may require checking the same property on many test documents, each of which might be an important use case.

Other tests use an XML dump functionality to dump the test document's layout to an XML file and use this file as a reference with which the test document can be compared later. Adding new test cases to this kind of test suite is easier and needs fewer code changes than the first form of tests. However, this kind of test checks the whole layout of the document in general. So when one of these tests fails, you need to check the XML file, understand its structure and find out why the reported difference is there. Failures of this kind can be more difficult to understand than failures of the first form, which point to a very specific document property and not to a reference XML file.

New chart test suite

An important difference between these two forms is the level of generality of what they test. The first form tests some specific properties of one specific test case, while the second form tests the whole layout of the test document. To avoid the issues of both forms, I chose to write a test suite designed to test the software functionality on a middle level of generality.

I added a test suite which contains tests similar to the first form, comparing some specific document properties to expected values using CppUnit assertions. The expected values are not hard-coded, however; they are written into and read from a simple structured reference file. It's implemented in a way which makes it easy to add a new test document for an existing test case and generate the reference file without extending the test code. On the other hand, we get a helpful error message when a test fails, since the test case is more specific than an XML dump test.
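
The idea of reference-file-driven assertions can be sketched in a few lines. This is only an illustration in Python, not the actual suite (which is C++ code using CppUnit with its own reference file format); the JSON format, the function name and the property names are all assumptions made for the example.

```python
import json
import os
import tempfile


def check_against_reference(actual, reference_path, generate=False):
    """Compare named properties with a reference file.

    In generate mode the reference file is (re)written from the actual
    values, so a new test document needs no new test code.
    Returns a list of human-readable failure messages.
    """
    if generate:
        with open(reference_path, "w", encoding="utf-8") as f:
            json.dump(actual, f, indent=2, sort_keys=True)
        return []

    with open(reference_path, encoding="utf-8") as f:
        expected = json.load(f)

    failures = []
    for name in sorted(set(expected) | set(actual)):
        if expected.get(name) != actual.get(name):
            failures.append(
                f"{name}: expected {expected.get(name)!r}, got {actual.get(name)!r}"
            )
    return failures


# Hypothetical usage: generate the reference once, then compare later runs.
reference = os.path.join(tempfile.mkdtemp(), "legend_position.json")
check_against_reference({"legend_x": 10, "legend_y": 20}, reference, generate=True)
print(check_against_reference({"legend_x": 11, "legend_y": 20}, reference))
```

Each failure names the property that differs, which is the "helpful error message" advantage over comparing a whole layout dump.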

Testing chart functionality

After I had the form, I spent some time adding the most common use cases to the test suite. I added test cases for different components of charts (axis, chart wall, legend, grid, data series, etc.) and also for common chart types (column chart, bar chart, pie chart, etc.). Each test case contains several subcases, testing functionally distinct use cases. The systematic testing of the chart functionality also pointed out some issues in the software:

Future possibilities

Now, with the new chart test suite, a bigger part of the chart functionality is covered with tests, but there are still use cases which are not tested. For example, some exotic chart types have no tests yet, like bubble charts, net charts or 3D charts. Tested document formats are also limited to LibreOffice native formats (ODS, ODP), but these tests are easy to extend to Microsoft Office file formats too, for compatibility testing. You can find the new test suite at chart2/qa/extras/chart2dump/ in case you need to add new test cases.

Wednesday, December 21, 2016

A short time spent on LibreOffice accessibility

LibreOffice and Orca

In the last few months I spent a short period fixing accessibility issues, mainly on Linux. LibreOffice has a bunch of issues of this kind (fdo#36549). This metabug collects the bugs which make it difficult for the Orca screen reader to make LO usable for visually impaired users. As I see it, Orca has a few workarounds in its LO-related code to handle these issues (e.g. ignoring false or duplicated events), but sometimes there is no such solution and we need to add improvements on the LO side.

Small fixes

So I did some bug fixing in this area. Most of the bugs were about missing accessibility events, which Orca needs in order to report changes that are visible on the screen and that users should therefore notice. Examples are when the selection changes on a Calc sheet (fdo#93825) or when the cursor moves inside a text portion (fdo#99687, fdo#71435). These issues can be frustrating for users who are used to getting feedback about the effect of their key presses.

Fixing these issues needed only small code changes, which shows that the LO accessibility code has a good structure in general. But as the code changes over time, some parts of it simply break without maintenance.

Spellcheck dialog

A somewhat bigger change I added was related to the spellcheck dialog (fdo#93430). The spellcheck dialog shows the errors the spellcheck algorithms find and offers some options to handle them (e.g. suggestions for correction). The problem with this dialog was the text entry which shows the misspelled word. This text entry contains a small part of the text and highlights the misspelled word in red. Orca tried to get this small text part and find out which word is the erroneous one, but LO did not return the right text attributes, so Orca did not have the necessary information to handle this situation.

Now, after fixing this issue, the text color and the boundaries of the misspelled word are accessible to Orca. It's great to see that Orca's developer, Joanmarie Diggs, has already adapted the code to handle this new information, so reading of the spellcheck dialog will be better in the next versions of both programs.

BeLin

I added these accessibility improvements while working for the IT Foundation for the Visually Impaired. One project of the Foundation is an Ubuntu-based operating system for visually impaired users called BeLin ("Beszélő Linux", which is "Speaking Linux" in English). Since it's an Ubuntu-based distribution, it has LibreOffice as its default office suite and uses Orca as its screen reader.

Hopefully these changes will make it more comfortable to use LibreOffice and Orca, both on BeLin and on other Linux distributions.

Friday, November 25, 2016

New pivot table function in Calc: Median

After doing some work on a pivot table related performance issue, I got a request to implement a new function for pivot tables. It seemed a useful feature to have and, at first sight, an easy thing to implement.

Pivot table functions

Pivot tables are used to analyse larger amounts of data using different statistical functions. Both LibreOffice Calc and Microsoft Excel have the same function palette with 11 functions like average, sum, count and so on. These aggregate functions can be used for data fields and for row/column fields. Data fields determine which source field is summarized and which function is used for that. Row/column fields don't have a function by default, but by setting one the user can calculate subtotals too.

Median

Why median? In psychology research, pivot tables can be useful for analyzing participant data. Which functions can be used for aggregation depends on the type of the data. For interval variables we can use the average, but for ordinal variables we need the median, which was missing from the function list.
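
To make the difference concrete, here is a small Python sketch of a pivot-style aggregation with a pluggable function. It is not Calc's implementation; the `pivot` helper, the field names and the ratings are made up for illustration.

```python
from collections import defaultdict
from statistics import mean, median


def pivot(rows, row_field, data_field, func):
    """Group rows by row_field and summarize data_field with func."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[row_field]].append(row[data_field])
    return {key: func(values) for key, values in groups.items()}


# Made-up ordinal ratings (1 = worst, 5 = best) from two participant groups.
rows = [
    {"group": "A", "rating": 1},
    {"group": "A", "rating": 2},
    {"group": "A", "rating": 5},
    {"group": "B", "rating": 4},
    {"group": "B", "rating": 4},
    {"group": "B", "rating": 5},
]

print(pivot(rows, "group", "rating", median))
print(pivot(rows, "group", "rating", mean))
```

For ordinal ratings the median stays on the original scale (a rating that was actually given), while the average produces values between ratings that have no clear interpretation.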

This was the starting point, but I've also found some user posts about this feature missing from MS Excel (see links below). It seems Excel users have had to face the very same problem again and again over the last ten years. Well, it has some advantages when software is open source, I guess.

Good news

In LibreOffice 5.3, median is available for pivot tables:

Thanks

... to my professor, Attila Krajcsi (Department of Cognitive Psychology, Eötvös Loránd University) for supporting the LibreOffice project, with the idea to have this new function and also with some course credits!

Thursday, October 20, 2016

Improve pivot table import performance (tdf#102694)

After a short break I got back to Collabora and I continue working on LibreOffice. This time I handled a performance issue in the pivot table import code, as part of our work for SUSE. In some cases importing an XLSX document containing several pivot tables took such a long time that the user could interpret it as a freeze.

After some testing we found that pivot table grouping is one of the points which make import slow. "We" means me and Kohei Yoshida, who is an expert in this area of the code and helped me understand it. So grouping was our focus this time.

Pivot table groups

In pivot tables we can make groups for columns of the source data. These groups can be name groups (general groups), number groups (e.g. number ranges) and date groups (e.g. grouped by year, month, day, etc.). Since these groups are related to the source, they are part of the source rather than of the pivot table layout. This is true for both the MS Excel and the LO Calc implementation. Effectively this means that when several pivot tables use the same source, they will have the same groups too. In LO Calc these groups are stored in the pivot cache linked with the corresponding source range.

Performance

The performance issue here was that although pivot tables of this kind (having the same source) were linked to each other by the pivot cache, XLSX import ignored that and worked as if pivot tables were fully independent. This means that the same groups were imported as many times as there were tables referencing them. What made it even slower is that tables of this kind are linked to each other in such a way that when one table's grouping is changed, the other tables are also affected.

This difference between the internal handling of pivot tables and the XLSX import code arose because the XLSX import code was written before groups became part of the pivot cache. So I actually needed to update the import code to follow the changes in the internal pivot table implementation. With that, the import time of pivot table groups became quite good. For example, the test document I uploaded to Bugzilla (tdf#102694) took more than 20 minutes to load before and now takes less than half a minute. This document contains a small data table and 20 simple pivot tables, so it's not something which should take so much time to load, and now it doesn't.
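
The shape of the fix can be illustrated with a small Python sketch. It is not the real import code (which is C++ inside LibreOffice); the function names, the table dictionaries and the source range string are hypothetical and only show the difference between importing groups per table and importing them once per shared source.

```python
def import_groups_per_table(tables, build_groups):
    """Old behaviour (sketch): groups are rebuilt for every pivot table."""
    return {table["name"]: build_groups(table["source"]) for table in tables}


def import_groups_per_cache(tables, build_groups):
    """New behaviour (sketch): groups are built once per source range
    and shared by all tables that reference that source."""
    cache = {}
    groups_by_table = {}
    for table in tables:
        source = table["source"]
        if source not in cache:
            cache[source] = build_groups(source)
        groups_by_table[table["name"]] = cache[source]
    return groups_by_table


def make_counting_builder():
    """Return a stand-in group builder plus a list recording each call."""
    calls = []

    def build_groups(source):
        calls.append(source)
        return {"source": source, "groups": ["by year", "by month"]}

    return build_groups, calls


# 20 pivot tables over the same (hypothetical) source range.
tables = [{"name": f"pivot{i}", "source": "Sheet1.A1:D100"} for i in range(20)]

builder, calls = make_counting_builder()
import_groups_per_table(tables, builder)
print(len(calls))

builder, calls = make_counting_builder()
import_groups_per_cache(tables, builder)
print(len(calls))
```

With a shared cache the expensive step runs once per source instead of once per table, and all tables see the same group objects.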

Saturday, March 28, 2015

MS Word compatible text highlighting in LibreOffice 5.0

LibreOffice has an old compatibility issue (inherited from OpenOffice) with Microsoft Office related to text highlighting (character background). During document exchange between the two office suites, text highlighting changes in unexpected ways. The root of the problem is that LibreOffice (LO) and Microsoft Office (MSO) have different designs for character backgrounds, so it's ambiguous how to save LO character background to MSO file formats (ambiguous both to users and to developers). Now this issue is solved by allowing users to specify the behavior of LO export.

Design differences

Microsoft Word has a two-character-background concept. One attribute is called shading, which lives on several levels of the document model: table cells, paragraphs and characters. With it we can add any background color to the selected object, and we can also use it in a theme. The other one is called highlighting, which offers a more limited color selection and is used, by analogy with a highlighter pen, to call attention to a portion of the document.

LibreOffice Writer, on the other hand, supports only one kind of character background, which is also called highlighting. This character background is closer to Word's shading attribute, since it has a wider color selection and it's also a specialization of the general background attribute, in the same way as shading in Word. Another similarity is that automatic font color interacts with character background in Writer and with shading in Word so that text always stays visible (e.g. when the background is dark the text color becomes white, and when the background is light the text color changes to black). Word highlighting doesn't have this kind of behavior.
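
The automatic font color behavior can be pictured with a short Python sketch. The luminance formula and the threshold of 128 are assumptions chosen for illustration; the exact rule Writer or Word applies is not described here.

```python
def automatic_font_color(background_rgb, threshold=128):
    """Pick white text on a dark background and black text on a light one.

    The weighted luminance and the threshold are illustrative assumptions,
    not the values used by Writer or Word.
    """
    r, g, b = background_rgb
    luminance = 0.299 * r + 0.587 * g + 0.114 * b
    return (255, 255, 255) if luminance < threshold else (0, 0, 0)


print(automatic_font_color((0, 0, 128)))    # dark blue background
print(automatic_font_color((255, 255, 0)))  # yellow background
```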

About Microsoft Word's concept

After I decided to fix this issue, the first questions were: How useful is this two-character-background concept of Microsoft Office? Is this something which LibreOffice should support?
My answer: It's not something that LibreOffice needs.

First of all, I can't see a real difference between the usage of shading and highlighting in Word. Both have the purpose of making some parts of the text more visible (highlighting them). Highlighting can also be used for reviewing a document (marking parts of a document temporarily), but both office suites have a Comments feature which includes an integrated highlighting function and which can be more useful for this purpose.

Additionally, this concept leads to misunderstandings among users. At first sight it can't be decided whether a highlighted text portion in a document is actually formatted with highlighting or with shading. This becomes worse because the highlighting feature is more accessible in some versions of Word and so is better known by users than shading, which can confuse users when they get a document with character shading. As I see it, this was the main issue in the case of LibreOffice as well: it exported character background as shading to MSO formats, so some users were not able to remove this shading in Word by using the "Highlight" toolbar button.

In summary, as I see it, the two-character-background concept does not have a real benefit and also causes problems on the user side, so it's not worth having.

Export as highlighting or as shading 

The second question was: How to export LO character background to MSO formats to make users happy? MSO shading is closer to LO character background in behavior, but at the same time highlighting is more accessible in Word, and in Writer's user interface character background is called highlighting. So both attributes are good candidates for export.

That's why it seemed best to have an option to choose between shading and highlighting. I added this option at Tools -> Options -> Load/Save -> Microsoft Office, with which the user can specify the behavior of the export. The default became highlighting, mainly because in LO character background is called highlighting, so users can expect to get highlighting in their MSO document and not shading.

A new section is highlighted with a red rectangle on Options dialog
New Microsoft Office compatibility option

Import of Microsoft Word documents

So far I have written only about the export of LO character background. The next question is how shading and highlighting are imported from MSO documents and how they are saved back. I solved this so that both attributes are preserved by import, so a Word document will have the same appearance in Writer. If this document is saved without modification, then shading and highlighting will be saved back unchanged.

The difference compared to Word becomes visible when editing. It doesn't matter whether a character background was shading or highlighting in the original document; it can be edited with the Highlighting toolbar button in Writer. From that point, after a specific text range's background is edited inside LibreOffice, the MSO shading and highlighting markers are removed and replaced with the LO-specific character background. So MSO attributes are preserved until the corresponding text range is edited in LibreOffice.
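
The round-trip rule described above can be modelled with a tiny Python sketch. The `Run` class, its field names and the `export_as` parameter are hypothetical and only model the behavior; they are not Writer's data structures, and details such as Word's limited highlighting palette are ignored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Run:
    """A text range with the background attributes it may carry (sketch)."""
    text: str
    mso_shading: Optional[str] = None    # preserved from an imported document
    mso_highlight: Optional[str] = None  # preserved from an imported document
    lo_background: Optional[str] = None  # set when edited inside LibreOffice


def edit_background(run, color):
    """Editing in Writer drops the MSO markers and sets the LO background."""
    run.mso_shading = None
    run.mso_highlight = None
    run.lo_background = color


def export_to_mso(run, export_as="highlighting"):
    """Untouched MSO attributes go back unchanged; an LO background is
    exported as highlighting or shading, depending on the chosen option."""
    if run.lo_background is not None:
        return {export_as: run.lo_background}
    exported = {}
    if run.mso_shading is not None:
        exported["shading"] = run.mso_shading
    if run.mso_highlight is not None:
        exported["highlighting"] = run.mso_highlight
    return exported


imported = Run("hello", mso_shading="FFFF00")
print(export_to_mso(imported))            # unchanged round trip
edit_background(imported, "00FF00")
print(export_to_mso(imported))            # default option: highlighting
print(export_to_mso(imported, "shading")) # alternative option
```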

Summary

In the next LibreOffice release, users will be able to customize their office suite in one more way by specifying how to export character background to Microsoft Word file formats. I think this is the best we can do here because of the design differences. Well, let's see what users say.

Thanks


Wednesday, August 20, 2014

3D models in Impress (LibreOffice 4.3)

The last LibreOffice release came with 3D model support in Impress. Now you can insert 3D models onto your slides in the open formats glTF, COLLADA and KMZ. To do that, go to Insert->Object->3D Model... in the menu hierarchy, and after you have selected a file you can see the model on your slide:

Impress with a duck model on the slide
Duck 3D model

Animated models

The inserted 3D models can also define animations. In this case, after insertion you can control the animation using the media toolbar's play/pause/stop buttons. The implementation leans on the existing movie player feature, which is why 3D animations behave the same as movies. This similarity also exists inside the slideshow, which means animations are started as soon as the current slide appears and the same custom animations are available to trigger play/pause/stop events.

Impress with a monster on the slide
Animated monster model

 

Hidden features of the model viewer window

This 3D model feature was implemented a bit late in the release process, which explains why some features of the viewer window are not discoverable on the UI. Hopefully the situation will be better in the next release, but until then let's see what these features are.
First you should know that the viewer window is active only when the 3D model is "played" (push the play button on the media toolbar or start the slideshow), even if the model doesn't actually have any animation to play. While the 3D model object is in the stopped state, only a screenshot of the scene is displayed, which makes resizing and positioning easier in edit mode.

Changing camera position

So when the viewer window is active, we can change the camera position to look at the model from different points of view. By default we are in walkthrough mode, where we can handle the camera from a first-person perspective. We can move forward ('W'), backward ('S'), left ('A') and right ('D') using the keyboard, and we can turn by click and drag.
You can see some pictures of the walkthrough mode below. You can enter the maze, but inside the maze navigation becomes hard. It would be perfect if we could walk in the maze like Pac-Man does, but camera handling still needs some improvements.

Pac-Man maze default view
Closer to the maze
Inside the maze
 
Besides the walkthrough mode there is another one called orbit mode. Use the 'M' key to switch between the two modes. In orbit mode the camera is moved on an orbit around the model, always looking at the model. Using the keyboard we can move the camera on this orbit northward ('W'), southward ('S'), westward ('A') and eastward ('D'), and we can also change the distance between the orbit and the center of the model ('E' and 'Q' keys). A click and drag event also moves the camera on this orbit, but from the user's point of view this looks rather like the model is rotated around its center.
In the next pictures you can see the different views after the model was rotated horizontally with the mouse (in other words, the camera was moved around the model horizontally):

Duck model, full face

Duck model, profile
Duck model, from behind
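
The geometry of orbit mode can be sketched with spherical coordinates. This is a generic illustration in Python, not the viewer's code; the function name, the angle conventions and the choice of Y as the "up" axis are assumptions.

```python
import math


def orbit_camera_position(center, distance, azimuth_deg, elevation_deg):
    """Position of a camera orbiting `center` at the given distance.

    azimuth moves the camera east/west around the model, elevation moves
    it north/south; the camera is assumed to always look at `center`.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    cx, cy, cz = center
    x = cx + distance * math.cos(el) * math.sin(az)
    y = cy + distance * math.sin(el)
    z = cz + distance * math.cos(el) * math.cos(az)
    return (x, y, z)


# Moving "eastward" changes the azimuth; the distance to the model stays
# the same, which is why it looks as if the model itself were rotating.
for azimuth in (0, 90, 180):
    print(orbit_camera_position((0.0, 0.0, 0.0), 5.0, azimuth, 0))
```

In this picture, changing the orbit distance (the 'E' and 'Q' keys in the post) corresponds to changing only the `distance` argument.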

 

FPS rendering

We added an FPS (frames per second) display to the viewer window, so the rendering performance can be measured easily. By default this feature is disabled. To enable it, press the 'F' key and you will see the FPS number in the bottom-right corner of the viewer window.

Monster model with FPS
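
An FPS number like this is usually computed by counting rendered frames over a time window. The Python sketch below shows one common way to do that; the class name and the one-second window are assumptions, and it is not taken from the viewer's source.

```python
class FpsCounter:
    """Counts frames and updates the FPS value about once per second."""

    def __init__(self, window_seconds=1.0):
        self.window_seconds = window_seconds
        self.window_start = None
        self.frames = 0
        self.fps = 0.0

    def frame_rendered(self, now):
        """Call once per rendered frame with the current time in seconds."""
        if self.window_start is None:
            # The first frame only starts the measurement window.
            self.window_start = now
            return self.fps
        self.frames += 1
        elapsed = now - self.window_start
        if elapsed >= self.window_seconds:
            self.fps = self.frames / elapsed
            self.frames = 0
            self.window_start = now
        return self.fps


# Simulated timestamps: one frame every 0.25 seconds.
counter = FpsCounter()
for i in range(5):
    fps = counter.frame_rendered(i * 0.25)
print(fps)
```

In a real render loop the timestamp would come from a monotonic clock such as `time.monotonic()`.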

 

About the file formats

The main format is COLLADA; the other ones are closely related to it.
First, KMZ is a zipped file format which can contain 3D models in COLLADA format. Not all KMZ files can be loaded by LibreOffice: it's assumed that the given KMZ file contains only one 3D model. The main advantage of supporting the KMZ format is the huge source of free 3D models on the 3D Warehouse site. On this site almost all models are available in KMZ format.
The glTF format is a really new format; actually it has only a draft specification. The main purpose of the format is better performance. It is designed to make loading and rendering of the models faster by using structures which are closer to OpenGL. It's not an independent file format but rather a runtime form of COLLADA. So in general glTF models are generated from COLLADA files, and creating/editing of the 3D models is done in COLLADA format.
Since glTF is designed to be faster, LibreOffice stores all 3D models in this form both at runtime and in the ODP file. When a KMZ or a COLLADA file is inserted into Impress, the file is converted to glTF and rendering comes just after that.
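
Since KMZ is just a zip archive, the "exactly one COLLADA model" assumption mentioned above is easy to picture. The following Python sketch uses the standard `zipfile` module; the function name and the archive contents are made up, and it is not how LibreOffice itself reads KMZ files.

```python
import io
import zipfile


def find_collada_model(kmz_bytes):
    """Return the name of the single COLLADA (.dae) entry in a KMZ archive.

    Raises ValueError if the archive holds no model or more than one,
    mirroring the one-model-per-file assumption.
    """
    with zipfile.ZipFile(io.BytesIO(kmz_bytes)) as archive:
        models = [n for n in archive.namelist() if n.lower().endswith(".dae")]
    if len(models) != 1:
        raise ValueError(f"expected exactly one COLLADA model, found {len(models)}")
    return models[0]


# Build a small in-memory KMZ with placeholder contents for demonstration.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("doc.kml", "<kml/>")
    archive.writestr("models/duck.dae", "<COLLADA/>")

print(find_collada_model(buffer.getvalue()))
```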

Some limitations you should be aware of

First of all, it's good to know that Impress uses OpenGL 3.0 for rendering these 3D models. If your graphics card doesn't support it, then only a question mark will be displayed on the screen, but the model is there, so if you save your presentation and move it to a capable computer it will appear.
Other limitations come from the fact that glTF is a draft format and the collada2gltf tool (used for COLLADA->glTF conversion) is also unstable. So don't be surprised if some of the KMZ files downloaded from the Warehouse are not rendered well.
For now this feature is available only on Windows and Linux.

Who stands behind the feature

3D model support came alive as a result of the cooperation of Collabora, AMD and MCW.
First of all, AMD funded our work and came up with new ideas about the feature.
Secondly, a developer group from MCW worked on the parser/rendering code of the glTF format. To make our cooperation with MCW more tolerant of misunderstandings, we set up a wall between the LibreOffice code and the glTF parser code by defining an API. Later, an open source glTF rendering library called libglTF was born from that separation. libglTF is still developed in close connection with LibreOffice.
Last but not least, we at Collabora did the integration into LibreOffice. First, Markus Mohrhard worked on the LibreOffice OpenGL code in general, modernized it and generalized it, allowing the same code to be used for all OpenGL-based features: OpenGL transitions, OpenGL charts and glTF models. Jan Holesovsky (Kendy) was the manager of the project, making plans, participating in brainstorming and having great ideas. Matus Kukan helped us with integrating COLLADA-related libraries into LibreOffice (for COLLADA->glTF conversion).
Finally, I implemented the embedding of glTF models into Impress by putting together the pieces (plans from Kendy, generic OpenGL pieces from Markus, COLLADA conversion from Matus and glTF rendering from MCW).

Monday, September 30, 2013

GSoC 2013 - Character border

Character border (fdo#35155)

This summer I participated in Google Summer of Code and implemented borders on the character level. So in the next LibreOffice version, 4.2, users will be able to set a border around a run of characters.

Two examples of character border. Above, there is a drop cap letter with a double border and blue background. Below, there are two words with a blue border around them plus a box shadow on the top-right side of the border.

The character border can be specified as direct formatting via the Format -> Character -> Borders tab, and it can also be part of a character style (e.g. Format -> Styles and Formatting -> Character Style). One special case of the latter is the drop caps character style.

Drop Caps

While I was working on character border and testing whether it works in all cases, I realized that Drop Caps had some bugs which kept me from adding my new feature. So I had to solve them before I was able to continue with my original project:

1. User defined background was shifted upward
The two parts of the picture show the background painting before and after the bug fix. Before, the default gray background was hanging out from behind the user-defined yellow background. The after part shows that once the bug is fixed, only the user-defined yellow background is painted.
Since the border is drawn at the edges of the background, it was also drawn in the wrong place.

2. Changing line height
A drop cap with the default gray background and a red arrow showing that the first line of the paragraph next to the drop cap is taller than expected.
The height of the first line changed unexpectedly when the drop cap letter's original height was bigger than the height of the following characters. "Original" means the height which it would have if it wasn't a drop cap. Now this original height is ignored during height calculation of the first line.
Since border rendering increases this height by the border width, the line height also increased by this width.

3. ODF import loses drop caps character style (fdo#43807)
It also loses the border included in this character style.

Border merge

The implementation of character attributes and the specification of the ODF file format make it necessary to add automatic border merge functionality. Border merge means that if two adjacent text ranges have equal borders (same line, padding and shadow), they will be rendered inside the same border.

Above, there is a text with two bordered text ranges with one Q letter between them. After this Q letter is removed, the two bordered text ranges are merged and so are rendered within the same border.

Border merge also means background merge, since the general concept is to draw the border at the edges of the background.
When there are letters with different sizes and they have a background, this background's height will depend on the height of the text. So letters with different sizes will have backgrounds of different heights. But if these letters are merged into one border group, they will have a coherent background.
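
The merge rule can be sketched in a few lines of Python. This is only a model of the idea, not Writer's layout code; the `Border` fields follow the "same line, padding and shadow" criterion from the text, while the names and value types are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Border:
    line: str      # e.g. "blue solid"
    padding: int   # distance between text and border
    shadow: str    # e.g. "none" or "top-right"


def merge_bordered_runs(runs):
    """Merge neighbouring text runs whose borders are equal.

    `runs` is a list of (text, Border or None) pairs in document order.
    Runs without a border are never merged in this sketch.
    """
    merged = []
    for text, border in runs:
        if merged and border is not None and merged[-1][1] == border:
            merged[-1] = (merged[-1][0] + text, border)
        else:
            merged.append((text, border))
    return merged


blue = Border(line="blue solid", padding=2, shadow="none")

# With the unbordered "Q" in between, the two ranges stay separate.
print(len(merge_bordered_runs([("abc", blue), ("Q", None), ("def", blue)])))
# After removing the "Q", both ranges end up inside one border.
print(merge_bordered_runs([("abc", blue), ("def", blue)]))
```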

File formats

List of file formats to which/from which Writer exports/imports character border:
ODT, HTML, DOC, DOCX, RTF