c# - Implementing Markdown to Word (OpenXML) in C#

Question

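I'm trying to implement my own version of Markdown for creating Word documents in a C# application. For bold/italic/underline I'm going to use **, ` and _ respectively. I've created something that parses pairs of **'s and outputs bold text by extracting the match and using something like this: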

RunProperties rPr2 = new RunProperties();
rPr2.Append(new Bold() { Val = new OnOffValue(true) });

Run run2 = new Run();
run2.Append(rPr2);
run2.Append(new Text(extractedString));
p.Append(run2);

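My problem comes when I start combining the three formats. I think I would have to work out every combination of formats and split the text into separate runs: a bold run, a bold italic run, an underline run, a bold underline run, and so on. I'd like my program to handle something like this: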

**_Lorem ipsum_** (creates bold & underlined run)

`Lorem ipsum` dolor sit amet, **consectetur _adipiscing_ elit**. 
_Praesent `feugiat` velit_ sed tellus convallis, **non `rhoncus** tortor` auctor.

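Basically, I'd like it to handle any mix of styles you can throw at it. However, if I'm generating these runs programmatically, I need to work everything out before setting the text into a run. Should I handle it with an array of character indexes for each style and merge them into one big list of styles (I'm not sure exactly how I would do this)?

One last question: does something like this already exist? If it does, I haven't been able to find it (Markdown to Word).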

score 2 · Accepted Answer

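I think you have to split your text into parts according to the formatting they have, and add each part to the document with the right formatting, as shown here: http://msdn.microsoft.com/en-us/library/office/gg278312.aspx.

So

**non `rhoncus** tortor` will become - "non "{bold}, "rhoncus"{bold,italic}, " tortor"{italic}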
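Each piece in a breakdown like this can become a single Run whose RunProperties carry every format that is active for it, so there is no need to prepare one kind of run per combination in advance. The following is a minimal sketch, assuming the Open XML SDK (the DocumentFormat.OpenXml package) that the question's snippet already uses; RunFactory and CreateRun are hypothetical names chosen for illustration.

```csharp
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

public static class RunFactory
{
    // Hypothetical helper: builds one Run that carries every active format.
    public static Run CreateRun(string text, bool bold, bool italic, bool underline)
    {
        var properties = new RunProperties();
        if (bold) properties.Append(new Bold());
        if (italic) properties.Append(new Italic());
        if (underline) properties.Append(new Underline { Val = UnderlineValues.Single });

        var run = new Run();
        // Only attach run properties when at least one format is active.
        if (properties.HasChildren) run.Append(properties);

        // Preserve leading/trailing spaces such as the one in "non ".
        run.Append(new Text(text) { Space = SpaceProcessingModeValues.Preserve });
        return run;
    }
}
```

With a Paragraph p as in the question, the bold and italic piece would then be added with p.Append(RunFactory.CreateRun("rhoncus", true, true, false)).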

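I think this is easier than making several passes over the text. You don't even have to parse the whole document up front. Just parse as you go, and write to the docx after each "change" in the format.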

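Another thought - if all you are creating is simple text, and that is all you need, it might be even simpler to generate the OpenXML itself. Your data is very structured, so it should be easy to create XML from it.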
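As a rough illustration of generating the markup directly, the sketch below builds the WordprocessingML for a run and a paragraph with System.Xml.Linq. RunXml, BuildRun and BuildParagraph are hypothetical names. The sketch only produces the paragraph fragment: a real .docx also needs the surrounding document.xml body and the package parts, which are not shown here.

```csharp
using System;
using System.Xml.Linq;

public static class RunXml
{
    // WordprocessingML main namespace (the "w:" prefix in document.xml).
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Builds <w:r><w:rPr>...</w:rPr><w:t xml:space="preserve">text</w:t></w:r>.
    public static XElement BuildRun(string text, bool bold, bool italic, bool underline)
    {
        var props = new XElement(W + "rPr");
        if (bold) props.Add(new XElement(W + "b"));
        if (italic) props.Add(new XElement(W + "i"));
        if (underline) props.Add(new XElement(W + "u", new XAttribute(W + "val", "single")));

        var run = new XElement(W + "r");
        // Omit <w:rPr> entirely for unformatted text.
        if (props.HasElements) run.Add(props);
        run.Add(new XElement(W + "t",
            new XAttribute(XNamespace.Xml + "space", "preserve"),
            text));
        return run;
    }

    // Wraps runs in <w:p> and declares the "w" prefix on it.
    public static XElement BuildParagraph(params XElement[] runs)
    {
        return new XElement(W + "p",
            new XAttribute(XNamespace.Xmlns + "w", W.NamespaceName),
            runs);
    }
}
```

Calling RunXml.BuildParagraph(RunXml.BuildRun("non ", true, false, false), RunXml.BuildRun("rhoncus", true, true, false)) yields a w:p element containing two w:r children.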

Here is a simple algorithm that does what I suggest...

using System;
using System.Collections.Generic;
using System.Text;

// These are the different formattings you have
public enum Formatings
{
    Bold, Italic, Underline, Undefined
}

public class FormattedTextParser
{
    // This will store the current format
    private Dictionary<Formatings, bool> m_CurrentFormat;

    // This will store which string translates into which format
    private Dictionary<string, Formatings> m_FormatingEncoding;

    public void Init()
    {
        m_CurrentFormat = new Dictionary<Formatings, bool>();
        foreach (Formatings format in Enum.GetValues(typeof(Formatings)))
        {
            m_CurrentFormat.Add(format, false);
        }

        // Delimiters as used in the question: ** = bold, ` = italic, _ = underline
        m_FormatingEncoding = new Dictionary<string, Formatings>
        {
            {"**", Formatings.Bold}, {"`", Formatings.Italic}, {"_", Formatings.Underline}
        };
    }

    public void ParseFormattedText(string p_text)
    {
        StringBuilder currentWordBuilder = new StringBuilder();
        int currentIndex = 0;

        while (currentIndex < p_text.Length)
        {
            Formatings currentFormatSymbol;
            int shift;
            if (IsFormatSymbol(p_text, currentIndex, out currentFormatSymbol, out shift))
            {
                // Write out the text collected so far, with the format that was active for it
                FlushCurrentWord(currentWordBuilder);

                // Skip the delimiter and toggle the format it stands for
                currentIndex += shift;
                m_CurrentFormat[currentFormatSymbol] = !m_CurrentFormat[currentFormatSymbol];

                // Re-check at the new position: another delimiter may follow directly,
                // or the text may end here
                continue;
            }

            currentWordBuilder.Append(p_text[currentIndex]);
            currentIndex++;
        }

        // Write out whatever follows the last delimiter
        FlushCurrentWord(currentWordBuilder);
    }

    private void FlushCurrentWord(StringBuilder p_wordBuilder)
    {
        if (p_wordBuilder.Length == 0)
        {
            return;
        }

        // This is the current word you need to insert
        string currentWord = p_wordBuilder.ToString();

        // This is the current formatting status --> m_CurrentFormat
        // This is where you can insert your code and add the word you want to the .docx

        p_wordBuilder.Clear();
    }

    // Checks if the current position is the beginning of a format symbol
    // if true - p_currentFormatSymbol will be the discovered format delimiter
    // and p_shift will denote its length
    private bool IsFormatSymbol(string p_text, int p_currentIndex, out Formatings p_currentFormatSymbol, out int p_shift)
    {
        // This is a trivial solution, you can do better if you need
        foreach (var formatString in m_FormatingEncoding.Keys)
        {
            // Ordinal comparison at the current index; it does not match (and does not throw)
            // when fewer characters remain than the delimiter needs
            if (string.CompareOrdinal(p_text, p_currentIndex, formatString, 0, formatString.Length) == 0)
            {
                p_shift = formatString.Length;
                p_currentFormatSymbol = m_FormatingEncoding[formatString];
                return true;
            }
        }

        p_shift = -1;
        p_currentFormatSymbol = Formatings.Undefined;
        return false;
    }
}
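One way to fill in the placeholder step in the parser above ("This is where you can insert your code") is to collect each piece of text together with the formats active for it, and only then turn the pieces into runs. The sketch below is a self-contained variant of the same toggle-on-delimiter idea. Segment, Style and SegmentSplitter are hypothetical names, and the delimiters are assumed to be the ones from the question (** for bold, ` for italic, _ for underline). It does no escaping, so an underscore inside a word would also toggle underline.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical types for illustration; not part of the answer's code.
[Flags]
public enum Style { None = 0, Bold = 1, Italic = 2, Underline = 4 }

public sealed class Segment
{
    public string Text { get; }
    public Style Format { get; }
    public Segment(string text, Style format) { Text = text; Format = format; }
}

public static class SegmentSplitter
{
    // Assumed delimiters, taken from the question.
    private static readonly (string Token, Style Flag)[] Delimiters =
    {
        ("**", Style.Bold), ("`", Style.Italic), ("_", Style.Underline)
    };

    public static List<Segment> Split(string text)
    {
        var segments = new List<Segment>();
        var buffer = new StringBuilder();
        var active = Style.None;
        int i = 0;

        while (i < text.Length)
        {
            bool matched = false;
            foreach (var (token, flag) in Delimiters)
            {
                if (string.CompareOrdinal(text, i, token, 0, token.Length) == 0)
                {
                    // A delimiter ends the current piece and toggles one format.
                    Flush(segments, buffer, active);
                    active ^= flag;
                    i += token.Length;
                    matched = true;
                    break;
                }
            }

            if (!matched)
            {
                buffer.Append(text[i]);
                i++;
            }
        }

        // Emit the text after the last delimiter, if any.
        Flush(segments, buffer, active);
        return segments;
    }

    private static void Flush(List<Segment> segments, StringBuilder buffer, Style active)
    {
        if (buffer.Length == 0) return;
        segments.Add(new Segment(buffer.ToString(), active));
        buffer.Clear();
    }
}
```

For the input **non `rhoncus** tortor` this produces three segments: "non " with Bold, "rhoncus" with Bold and Italic, and " tortor" with Italic. Each segment can then be written to the document as one run.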
