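I'm not very familiar with regular expressions, and I need a way to parse a list of items from Wikipedia. I pulled the content with Wikipedia's api.php, and the data I'm left with looks like this: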
==Formal fallacies==
A [[formal fallacy]] is an error in logic that...
* [[Appeal to probability]] – takes something for granted because...
* [[Argument from fallacy]] – assumes that if an argument ...
* [[Base rate fallacy]] – making a probability judgement...
* [[Conjunction fallacy]] – assumption that an outcome simultaneously...
* [[Masked man fallacy]] – ...
===Propositional fallacies===
* [[Affirming a disjunct]] – concluded that ...
* [[Affirming the consequent]] – the [[antecedent...
* [[Denying the antecedent]] – the [[consequent]] in...
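So I need a way to extract the data such that:
- only lines starting with * [[ are considered
- whatever sits between * [[ and ]] is the name
- the rest of the line after the dash (–) is the description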
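
To make the three rules above concrete, here is a minimal sketch in Python of what I imagine the matching could look like. The names `ITEM_RE` and `parse_items` are just placeholders I made up, and it assumes the separator is either an en dash (as in the sample) or a plain hyphen. It does nothing special for piped links such as [[Target|Label]], so the whole inner text would come back as the name.

```python
import re

# Assumed line shape: "* [[Name]] – description".
# Group 1 captures the text inside the first [[ ]] pair,
# group 2 captures everything after the dash separator.
ITEM_RE = re.compile(r"^\*\s*\[\[([^\]]+)\]\]\s*[–-]\s*(.*)$")


def parse_items(text):
    """Return (name, description) pairs for lines that start with '* [['."""
    items = []
    for line in text.splitlines():
        match = ITEM_RE.match(line.strip())
        if match:
            items.append((match.group(1), match.group(2)))
    return items


sample = """==Formal fallacies==
A [[formal fallacy]] is an error in logic that...
* [[Appeal to probability]] – takes something for granted because...
===Propositional fallacies===
* [[Affirming the consequent]] – the [[antecedent...
"""

for name, description in parse_items(sample):
    print(name, "->", description)
```

Headings and the introductory sentence are skipped because they do not begin with `* [[`, and any later [[links]] inside the description are left untouched as part of the description text.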