Inline math treatment depends on spaces and other chars #55

Closed
dcervenkov opened this issue Jan 7, 2019 · 8 comments
Labels: bug, closed in dev, help wanted
Milestone: v0.8

Comments

@dcervenkov

When processing the following file

\documentclass{article}
\begin{document}
An example showing problems with inline math like $\theta_b$ and 
$\theta_b\geq3$ but not $\theta_b \geq 3$ or $\theta_b=3$.
\end{document}

I get these errors

* L5C58-L5C67 Possible spelling mistake found. Suggestions: [theta, thetas] 
  (57) [lt:en:MORFOLOGIK_RULE_EN_US] 
  th inline math like $\theta_b$ and $\theta_b\geq3$
                       ^^^^^^^^^^
* L5C73-L5C80 Possible spelling mistake found. Suggestions: [theta, thetas] 
  (70) [lt:en:MORFOLOGIK_RULE_EN_US] 
   like $\theta_b$ and $\theta_b\geq3$ but not $\the
                        ^^^^^^^^

It seems that some inline math is not cleaned properly.

@sylvainhalle added this to the v0.8 milestone on Jan 8, 2019
@sylvainhalle added the bug and help wanted labels on Jan 8, 2019
@sylvainhalle
Owner

This is caused by the regex that detects inline math, which is currently this:

^\$.*?[^\\\\]\$

It does not detect inline math that contains backslashes. However, if I remove [^\\\\] from the regex, this time it will fail to clean inline math that contains \$, so I am a bit at a loss here.

This is a mild example of #56: I added a rule to interpret simple inline math, and now it looks like I am starting to have to interpret more and more of it...

@matze-dd

Hi Sylvain,

in a smaller related project, I use the Python-style regex
r'$([^\\$]|\.)*$'.
Something like that should work here, too.

Matthias

@matze-dd

Oops, some backslashes vanished. Trying again:
r'\$([^\\\\$]|\.)*\$'
(In Python, the backslash also has to be escaped inside a set [^...].)

Matthias

@sylvainhalle
Owner

Thanks Matthias! I'll try this out.

@matze-dd

Sorry, there was a mistake again (even with preview):
r'\$([^\\\\$]|\\.)*\$'
Inside the math, each character must be neither \ nor $, unless it is a \, which then consumes the next character.
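
To make the behaviour of this last pattern concrete, here is a minimal Python sketch that applies it to the sentence from the original report. The helper name `strip_inline_math` and the placeholder `X` are made up for illustration and are not part of the tool discussed in this issue.

```python
import re

# Pattern from the comment above. Between the two dollars it accepts either
# a character that is neither a backslash nor a dollar, or a backslash
# followed by any single character (so \$ does not end the math).
INLINE_MATH = re.compile(r'\$([^\\\\$]|\\.)*\$')


def strip_inline_math(text, placeholder="X"):
    """Replace every inline math span with a placeholder (hypothetical helper)."""
    return INLINE_MATH.sub(placeholder, text)


sample = (r"inline math like $\theta_b$ and $\theta_b\geq3$ "
          r"but not $\theta_b \geq 3$ or $\theta_b=3$.")
print(strip_inline_math(sample))
```

All four math spans in the sample are replaced, with or without spaces around `\geq`. The pattern only looks at the dollars themselves, so an escaped `\$` outside of math would still be treated as an opening delimiter.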

@PinkieSwirl

@sylvainhalle Maybe you could implement a solution that reads in regular expressions (one per line), ship this one as an example, and say it's not perfect. That way, if someone wants to change it, add to it, or do anything else with it, that's possible without additional work for you.
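
A rough Python sketch of what this suggestion could look like is given below. Everything in it is an assumption made for illustration: the function names `load_patterns` and `apply_patterns`, the convention that blank lines and lines starting with `#` are skipped, and the placeholder `X`. None of it exists in the tool.

```python
import re


def load_patterns(lines):
    """Compile one regular expression per non-empty line.

    `lines` can be any iterable of strings, for example an open file.
    Skipping lines that start with '#' is an assumed convention.
    """
    patterns = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        patterns.append(re.compile(line))
    return patterns


def apply_patterns(text, patterns, placeholder="X"):
    """Apply each pattern in order, replacing every match with the placeholder."""
    for pattern in patterns:
        text = pattern.sub(placeholder, text)
    return text


# The shipped example file would contain the inline math pattern from this thread.
example_file = [
    "# inline math, not perfect",
    "",
    r"\$([^\\\\$]|\\.)*\$",
]
patterns = load_patterns(example_file)
print(apply_patterns(r"a $\theta_b\geq3$ b", patterns))
```

Users could then add or replace lines in the pattern file without a code change, at the cost of having to debug their own expressions.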

@sylvainhalle
Owner

@PinkieSwirl That's an interesting suggestion, but I think for the moment, I'll go with @matze-dd's regex and see if I get more complaints in the next release. ;-)

@sylvainhalle
Owner

This was a tricky one. I couldn't get the regex suggested by @matze-dd working in Java. So I used a more brutal approach, and replaced the regex trick by a full-fledged method. Seems to do what we want based on the unit tests I wrote!
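
The Java method that replaced the regex is not shown in this thread. As an illustration of what a hand-written scanner for this job can look like, here is a minimal Python sketch. The function name `remove_inline_math` and the placeholder `X` are made up for the example, and this is not the code that was committed.

```python
def remove_inline_math(line, placeholder="X"):
    """Replace $...$ spans with a placeholder by scanning character by character.

    A backslash always protects the character after it, both inside and
    outside of math, so an escaped dollar never opens or closes a span.
    """
    out = []
    i = 0
    n = len(line)
    while i < n:
        ch = line[i]
        if ch == "\\" and i + 1 < n:
            # Escaped character outside math: copy both characters unchanged.
            out.append(line[i:i + 2])
            i += 2
        elif ch == "$":
            # Look for the closing dollar, skipping escaped characters.
            j = i + 1
            while j < n and line[j] != "$":
                j += 2 if line[j] == "\\" else 1
            if j < n:
                out.append(placeholder)
                i = j + 1
            else:
                # No closing dollar: leave the rest of the line untouched.
                out.append(line[i:])
                break
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(remove_inline_math(r"like $\theta_b$ and $\theta_b\geq3$."))
print(remove_inline_math(r"costs \$5 and $x$"))
```

Compared with a single regex, the explicit loop makes the escape handling easy to unit test case by case. This sketch does not treat display math written as `$$...$$` specially.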

@sylvainhalle added the closed in dev label on Aug 3, 2019