Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Markup instances as Safe ones #74

Open
playpauseandstop opened this issue Jun 16, 2024 · 5 comments
Open

Support Markup instances as Safe ones #74

playpauseandstop opened this issue Jun 16, 2024 · 5 comments
Labels
enhancement New feature or request

Comments

@playpauseandstop
Copy link

playpauseandstop commented Jun 16, 2024

At a moment there is a way for ludic to avoid escaping any content in a string - by wrapping it into ludic.base:Safe as follows,

from ludic import html as lhtml
from ludic.base import Safe

class SomePage(Component[NoChildren, NoAttrs]):
    @override
    def render(self) -> lhtml.body:
        content = ...  # Generate page content 
        return lhtml.body(
           *content,
           lhtml.script(Safe(...), type="text/javascript"),
        )

and it works just fine.

But there is one very popular Python library, MarkupSafe, which provides Markup type and I wonder, whether it possible for ludic to treat those instances as Safe ones and do not attempt to escape them?


I'm asking, cause when I want to render Markdown content wrapped into Markup strings as,

        return PageLayout(
            markdown.markdown(extra["prologue"]),
            *links,
            markdown.markdown(extra["epilogue"]),
            extra_js=(
                lhtml.script(
                    Markup('new AnchorJS().add("h2");'), type="text/javascript"
                ),
            ),
        )

the output, for some reason, looks -->

Screenshot 2024-06-16 at 18 04 32

But when I wrap markdown.markdown(...) calls, which returns Markup instances, into Safe (and change Markup in lhtml.script to Safe),

        return PageLayout(
            Safe(markdown.markdown(extra["prologue"])),
            *links,
            Safe(markdown.markdown(extra["epilogue"])),
            extra_js=(
                lhtml.script(
                    Safe('new AnchorJS().add("h2");'), type="text/javascript"
                ),
            ),
        )

all looks good -->

Screenshot 2024-06-16 at 18 06 14


And I wonder, if it possible for ludic to treat Markup instances (automagically or via specific config) as Safe ones?

@paveldedik
Copy link
Member

paveldedik commented Jun 18, 2024

Hi Igor, thank you for the issue.

I am worried about security issues, it seems that python-markdown had the safe_mode attribute which has been deprecated and now users are advised to use an HTML purifier. See https://stackoverflow.com/a/10474537

I was thinking that maybe it could work like this:

  • creating a special Markdown component (could be part of ludic.catalog.markdown) which would basically be implemented with the Safe keyword.
  • it still is necessary to have a safe or trusted attribute which basically says "the markdown contains custom HTML which cannot be trusted"

Here is a possible implementation:

from typing import override

from markdown import markdown
from ludic.attrs import GlobalAttrs
from ludic.base import ComponentStrict, JavaScript, Safe
from ludic.html import div, srcipt


class MarkdownAttrs(GlobalAttrs, total=False):
    trusted: bool


class Markdown(ComponentStrict[str, GlobalAttrs]):
    """Renders given string using Python-Markdown library."""

    @override
    def render(self) -> div:
        content = self.children[0]
        if not self.attrs.get("trusted", False):
            content = # purify content using a library
        return div(Safe(markdown(content)), **self.attrs)

The usage:

# Not using HTML, so no need to explicitly pass `trusted=True`
Markdown("# Header")

# I trust the input
Markdown("# Header <script>...</script>", trusted=True)

# I don't trust the input
Markdown(user_input)             

Would that make sense?

@paveldedik
Copy link
Member

Sorry Igor, I just noticed the suggestion of the MarkupSafe library. Let me think, that probably allows a bit different approach.

@paveldedik paveldedik added the enhancement New feature or request label Jun 18, 2024
@paveldedik
Copy link
Member

paveldedik commented Jun 18, 2024

I need to properly analyse this use case. The reason is that Ludic is already escaping user input, so for example, this is safe to do in Ludic:

from ludic.html import em

p(f"Hello {em(name)}")

It correctly renders while escaping name. The markupsafe library is basically allowing similar thing.

Now I think it should not be needed to use markupsefe at all and still be safe. But that requires making the following work:

from ludic.html import em
from markdown import markdown

markdown(f"Hello {em(name)}")

Which would not work, of course, because python-markdown doesn't know anything about ludic's elements.

I need to analyse what would be the bast way to allow this - whether it is possible at all. Maybe it will involve usage of markupsafe in the end (by creating a Ludic component, something like Markdown that would incorporate all these libraries)

@playpauseandstop
Copy link
Author

Sorry for not providing full context initially, but my markdown.markdown(...) call from the description has nothing common with mentioned python-markdown library. So let me try once again.


My main request for ludic is: support extra Safe strings, which can have other sources and not only be wrapper as ludic.base:Safe.

Safe string (in my case) passed as children to ludic component as markupsafe:Markup instance, but, if I'll use ludic with Django, it is natural that Django's safe string will be passed as django.utils.safestring:SafeString instances.

And I've run into the issue, which mentioned in the description, when pass following Markup instances to ludic,

@attrs.frozen(kw_only=True)
class Markdown:
    extras: tuple[str, ...] = DEFAULT_MARKDOWN_EXTRAS

    def markdown(self, text: str, /) -> Markup:
        return Markup(markdown2.markdown(text, extras=self.extras).strip())

markdown = Markdown()

class SomePage(Component[NoChildren, SomePageAttrs]):
    @override
    def render(self) -> lhtml.div:
        return lhtml.div(markdown.markdown(...), ...)

So, once again, the focus, from my POV, shouldn't be on python-markdown / markdown2 / any other library, which generates HTML from the text, but rather on:

And by ensuring that ludic somehow can be configured to understand that if such instance passed as children to component it shouldn't be rendered anyhow, by treating it as safe string.


Hope, that is more clear this time 🙏


ps. And yes, at a moment, I've fixed the original issue and my Markdown class knows which wrapper to use after the markdown2.markdown(...) call, so all good at my side and the issue can be moved to discussion TBH

@paveldedik
Copy link
Member

paveldedik commented Jul 10, 2024

Thank you for the clarification. I guess my main problem is I don't know how it should be implemented and I have to think about that, I see the following options:

  1. a global configuration variable of "safe" classes - I don't like that because users need to explicitly register the class there. But perhaps it is still a better option than the next two options.
  2. a contrib module with inherited e.g. "SafeString" class for Django which is recognized by ludic. We'd probably end up with a lot of contrib packages which would only contain one class.
  3. hardcoded list of "safe" classes from various libraries - even though it is possible with try: ... except ImportError it doesn't seem especially nice and creates a bit of coupling, we could end up with a huge list of "safe" classes in Ludic. For the users of Ludic, this is probably the best option as they don't really need to do anything and it will work for them.

Do you see any other options? Which one sound the best for you? I guess I'm considering the first and the third option. Do you know about any libraries which use this approach?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants