How to remove markdown formatting characters?

Good evening

I am quite sure I read every piece of documentation I could find, but maybe I overlooked something, because my request is simple: how can I remove the markdown formatting characters from a text?

E.g. I have the following in my content.txt:

Text:

**New Apples: What You Need to Know**

New apples just arrived! Click on the button below to find out more.

Now I want to prepare the text to be sent out as a plain-text e-mail, meaning I have to remove the ** characters.

Is there a built-in function to do that?

Thanks
Andreas

You can use $field->excerpt(): $field->excerpt() | Kirby CMS

Thank you @texnixe - that works in principle but it removes all new lines too. These should stay.

Do you have an idea how I could accomplish that? Maybe it would be easier to simply remove all markdown formatting strings manually? I'm not sure.

You can take a look at what the field method actually does and implement only the parts you need, e.g. leaving out the line that converts the line breaks: kirby/src/Toolkit/Str.php at main · getkirby/kirby · GitHub
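As a rough illustration of that idea, here is a minimal sketch that strips tags and collapses spaces but leaves newlines alone. It assumes the markdown has already been rendered to HTML (for example with $field->kirbytext()); the helper name htmlToPlainText is hypothetical and not part of Kirby, and the set of block tags it handles is only an assumption.

```php
<?php

/**
 * Hypothetical helper (not part of Kirby): reduces rendered HTML to
 * plain text while keeping line breaks.
 */
function htmlToPlainText(string $html): string
{
    // Normalize Windows/Mac line endings to "\n"
    $text = str_replace(["\r\n", "\r"], "\n", $html);

    // <br> becomes a single newline
    $text = preg_replace('#<br\s*/?>#i', "\n", $text);

    // Closing block-level tags get a blank line after them
    $text = preg_replace('#</(p|h[1-6]|blockquote|ul|ol)>#i', '$0' . "\n\n", $text);

    // Closing list items get a single newline
    $text = preg_replace('#</li>#i', '$0' . "\n", $text);

    // Remove the remaining tags and decode entities such as &amp;
    $text = strip_tags($text);
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Collapse runs of spaces and tabs, but do not touch newlines
    $text = preg_replace('/[ \t]{2,}/', ' ', $text);

    // Drop spaces around newlines and limit blank lines to one
    $text = preg_replace('/[ \t]*\n[ \t]*/', "\n", $text);
    $text = preg_replace('/\n{3,}/', "\n\n", $text);

    return trim($text);
}
```

Called on the rendered HTML of a field, this should return the text with paragraphs still separated by a blank line, which is what a plain-text e-mail needs.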

Thank you Nico @distantnative that is so obvious, of course I did not think of that :+1:

For everyone else who faces the problem: I created what I think is a clean version of removing all markdown formatting (including the actual headings) with this code:

// ADAPTED FROM https://github.com/getkirby/kirby/blob/main/src/Toolkit/Str.php#L481
/**
 * Removes all markdown formatting, HTML and PHP tags from a string
 *
 * @param string $string The string to strip formatting from
 * @return string The plain-text string
 */
function stripFormatting(string $string): string {

    // Strip markdown syntax. Inline markers keep their inner text ($1),
    // heading lines are removed entirely, and list/blockquote markers are
    // only matched at the start of a line so that hyphens and numbers
    // inside sentences stay untouched.
    $string = preg_replace([
        '/^#{1,6}[ \t].*\n?/m',         // Headers # Header (whole line removed)
        '/!\[(.*?)\]\((.*?)\)/',        // Images ![alt](url), before links
        '/\[(.*?)\]\((.*?)\)/',         // Links [text](url)
        '/\*\*(.*?)\*\*/',              // Bold **text**
        '/\*(.*?)\*/',                  // Italic *text*
        '/(?<!\w)_(.+?)_(?!\w)/',       // Italic _text_ (not snake_case)
        '/~~(.*?)~~/',                  // Strikethrough ~~text~~
        '/`(.*?)`/',                    // Inline code `code`
        '/^[ \t]*>[ \t]?/m',            // Blockquotes > text
        '/^[ \t]*-[ \t]+/m',            // Unordered lists - item
        '/^[ \t]*\d+\.[ \t]+/m',        // Ordered lists 1. item
        '/\n{3,}/',                     // Three or more newlines to two newlines
    ], ['', '$1', '$1', '$1', '$1', '$1', '$1', '$1', '', '', '', "\n\n"], $string);

    // We return plain text, e.g. no HTML or PHP tags
    $string = strip_tags($string);

    // Remove double spaces
    $string = preg_replace('![ ]{2,}!', ' ', $string);
    
    // Let's make sure there is no whitespace before or after the text
    $string = trim($string);

    return $string;
}
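One detail worth checking when adapting a pattern list like this: with preg_replace, an empty replacement removes the whole match, text included, while a $1 backreference keeps the captured text and drops only the markers. A small sketch using the example from the question (the variable names are just for illustration):

```php
<?php

$markdown = "**New Apples: What You Need to Know**\n\nNew apples just arrived!";

// Empty replacement: the bold text disappears together with the asterisks
$dropped = preg_replace('/\*\*(.*?)\*\*/', '', $markdown);

// "$1" re-inserts the first capture group, so only the asterisks are removed
$kept = preg_replace('/\*\*(.*?)\*\*/', '$1', $markdown);

echo trim($dropped), "\n---\n", $kept, "\n";
```

For a plain-text e-mail the second variant is usually the one you want for inline formatting such as bold, italic, inline code and links.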

Thanks to ChatGPT for producing this regex for me in 2 seconds, which would have taken me years.