largest change in terms of code size in Mizar in the last ten years
used that for number of projects:
Full HTML for Mizar with a number of explanation capabilities - MHTML project, produced HTML for Mizar since 2005, ca. 30 versions since, hosted at http://lipa.ms.mff.cuni.cz/~urban/xmlmml/
Export for Automated theorem provers (MpTP project, using TpTP language) - inclusion of MML into TPTP and CASC, the MpTPChallenge competition, CASC LTB, Mizar Cross-verification by ATPs
Translation of Mizar to ATP formats and work on strong ATP for Mizar
export to untyped first order logic
done in such a way that large parts of MML can today be independently checked by an ATP-based toolchain
number of experiments with automated reproving of Mizar theorems (reached ca. 60% reprove success for non-arithmetical Mizar theorems, and 99.9% success for Mizar atomic justification steps)
finding new and better proofs using ATPs - feedback for improving MML
ATP competitions
ATP-based explanation of complicated Mizar steps
advanced AI systems trying to simulate mathematicians' thinking in large theories (Malarea - Machine Learner for Automated Reasoning - won last the 2008 CASC MZR category)
recent work for Isabelle (with Lary Paulson, Tobias Nipkow and Jasmin Blanchette) in this direction - Trained Isabelle Proof Advisor (http://lipa.ms.mff.cuni.cz/~urban/isab_adv_demo.html), Malarea improving Larry's Sledgehammer ATP linkup for Isabelle
Experimental Proof Advisor for HOL Light trained on Hales' Jordan
Web, versioning, etc.
Established wiki for Mizar developers in 2002 (TWiki), some people even use it
Put ca. 100 versions of MML into CVS, set up a web access (showing diffs, etc.) and custom syntax highlighting via ViewCVS?, also in 2002, unfortunately not used by the Mizar Library Committee and eventually died
In 2004 wrote the MizarWishlist for the 30 years of Mizar workshop
Suggested to go big by comparing to Wikipedia, open-source Mizar, have clear licensing for MML
Suggested to include non-verified theorems in MML in some way (grey zone) to enable formalization of all math parallelly, not linearly, and to make use of all the good semantic processing features (search, ATP, etc.) for the grey zone
Suggested to try to push semantic annotation (grey zone) of the main results from the informal math articles, hopefully resulting in much better search for math (based on my MoMM experience)
Suggested a wiki for MML where humans cooperate with other tools like ATPs
etc., all of it pretty much ignored
My Recent work on web interface prototype for Mizar (could be called MML Wiki at some point)
Server verifies a Mizar article, the output is viewable, errors not shown yet
The article is immediately fully HTML-ized and linked to the existing library using my Mizar XML to HTML translation with some tweaks
The article is immediately translated to TPTP, and corresponding ATP problems are created for all Mizar theorems and atomic steps in the article (done by MpTP, tweaked to be relatively efficient)
The user then can:
check the Mizar inferences independently by ATPs
if successful, the system shows the exact part of the Mizar machinery (type system, etc.) needed for justifying the step
this can be used to prove by ATP steps which are too hard for Mizar, i.e., the ATPs can provide a proof and the necessary Mizar theorems even when the user has not explained the proof to Mizar yet - proof assistance using ATPs
the system can also render the ATP proof using the IDV derivation viewer, and try to automatically classify the important lemmas in the proof using the AGiNT? system
if a fast proof is not possible, the user is dropped with the problem into the SystemOnTPTP? interface and try to solve the problem with other systems and resources; this can be again used for proof assistance
if the user does not know which Mizar theorems and definitions to use for the proof, he can ask the integrated MizarProofAdvisor for theorem suggestions for that particular conjecture
Current state of development
started in late 2008
very experimental, trying to integrate a number of functionalities that I or others did for Mizar so that they were useful for users
uses Perl for cgi and plumbing
Prolog (SWI) for the ATP problem creation and management of the translated MML (the (MPTP project has ca. 6000 Prolog lines right now)
XSL (xsltproc) for translation of Mizar XML to HTML and to extended TPTP (ca. 14000 lines of XSL, compacted to ca. 6500 lines of XSLTXT)
Overview of plans - pretty much general and applicable to any formal Math Wiki
provide persistence, versioning, and authentication - save users' work on new articles on the server (already done) and make it persistently accessible to them (not done yet) - this is a lot, but not all, of the "wiki" functionality
provide editing access also to the existing MML articles - combined with previous this is the "wiki", this means:
choosing the versioning model
finding solutions to dependency issues
finding suitable cohabitation of verified and unverified math
choosing a suitable storage model
licensing and publication - making all versions downloadable and locally available (git repo?)
choosing a suitable editing model
thoughts on system suitability
opening up the whole thing to external developers and users who want to add some functionality
adding some other search engines; for Mizar certainly MMLQuery, for Isabelle the "find_theorems" utility, Coq (Helm?), ...
note that the linking and rendering issue is already solved for Mizar and perhaps for Coq, I did a crude hack for HOL Light - http://lipa.ms.mff.cuni.cz/~urban/hol_light_html/ ; talked with Makarius about Isabelle's html-ization - Makarius wants it but not yet there
Other optional cool things
Choosing the versioning model
possibilities:
a sea of equal clones (github), all users and all branches/repositories are formally equal
only one strongly guarded current version as in WP
a couple of more and less guarded development branches (projects like Debian, Ubuntu, Linux)
suitable combinations thereof
Choosing a suitable storage model
currently just in filesystem, good for experimenting and debugging, storage is cheap today, used e.g. for TWiki and MMLQuery, bad for complete download of the repository and for truly distributed model
SQL database - the Wikipedia (Mediawiki) solution, things like memory caching taken care of, some search capabilities in the SQL engines (not sure how useful for Math Wiki); also user authentication
XML database - tried with the eXist native XML DB, provides autoindexing, powerful XQuery search language for structured data, easy XSL-based HTML presentation of results; problems - was still quite experimental in 2005, not for heavy-duty usage
Good version system (git, mercurial, ...) - very big advantage is that cloning is cheap in terms of space and time - virtually unlimited versioning and cloning (see github for a large scale deployment example); also user authentication for free, supported diff and merge functionalities, web frontends for version management, easy download and replication
Choosing a suitable editing model
possibilities:
simple editing window (now my default) - for small changes or one-time upload of large pieces
JavaScript? editor - Cezary's ProofWeb?, many js editors readily available today (e.g. for Media Wiki) - how suitable are they for Math Wiki?
linkage (modes) for Emacs, JEdit, Eclipse, and other editors for power users - e.g. a minor Math Wiki editing mode for Emacs with similar functionalities to CVS and Git modes)
Finding solutions to dependency issues
in Mizar to some extent easy, to some extent harder; generally hard and system-specific (just did a bit of it for Isabelle with Jasmin Blanchette - they are building the Mirabelle system for tasks like this)
there is always the "slow default": re-verify all suspicious dependent parts, however we would like to be much more specific
a lot of the Mizar dependence issues solved already for the TPTP export, reproving, a machine learning on proofs, and fast processing
how articles (crudely) depend on articles (easy, done)
how theorems depend on theorems and lemmas (easy, done)
how theorems depend on definitions (done to some extent by hacking Mizar, the rest is either imprecise, or needs more Mizar hacking, or needs ATP-based explanation (this is done and works quite well))
how theorems depend on user-supplied automations (Mizar type system) - done by ATP-based explanation (works quite well, however could be made faster and better by unknown amount of Mizar hacking)
various kinds of dependency events:
change a proof without changing the proved theorem - no problem
change a theorem or definition meaning - dependent theorems become unsafe and need to be re-verified
change symbol's typing - each occurrence of the symbol becomes unsafe and needs to be re-verified
rename symbol - needs to be carried out consistently everywhere
... etc.
Cohabitation of verified and unverified math
unverified math needed for parallel import of all of math, the serial approach slows us down
the unverified formal math at least required to parse - already quite a strong check with many dependencies - lots of work in Mizar
trust-based and social metrics usable for unverified math - e.g. like article grading in WP
if needed, the verified math depending on unverified math can always be made independent of it, by stating the unverified assumptions explicitly
life is dangerous, even formal verified math does not protect against buggy definitions
sometimes we will get buggy unverified theorems, but my experience is that this is not a big danger - Mizar formalization is quite often also planned on the high level first, possibly with bugs, but if the planning is good and the proof works, this is not so hard to correct later
Opening up the whole thing to external developers and users who want to add some functionality
examples of things others might like to plug in as robots and services in WP:
Mizar proof improving utilities running in the background
Latex presentation
Customizable user presentation modes
..., etc.
Thoughts on system suitability
the more expert-only the underlying system, the worse to make many mathematicians involved
two purposes: for current expert users (give them the system they want) and for new people (give them something stupid simple)
do you think higher order and constructive math are the right thing for a newcomer??
do you think tactical proofs are the right thing for a newcomer?
should not the proofs be also readable?
hence my preference for declarative proofs and probably ZF
I am not saying Mizar is perfect - the Mizar "type system" can make things quite difficult to understand, however I don' know anything better yet
some new declarative proof language (improved Isar/ZF?) - this is very dangerous, we should not start by designing a new language
the Math Wiki should be developed suitably generically, so that when systems improve, they can be easily plugged in
Other optional cool things
merging of libraries - providing tagging of definitions and theorems with a common language - perhaps using WP concepts?,
having AI algos suggesting new similar pieces based on already identified similarities (e.g. analogous theorems used in proofs means that the proved things are also likely to be analogous) - lots of interesting AI
providing translations (possibly crude and incomplete, but at least a bit automated) between systems - this is generally hard, but in many cases could be useful (I did it for Mizar and TPTP, plans could be e.g. import of MML to Isabelle/ZF, some basic translation of first-orderized MML to HOL and Isabelle/HOL)
having a common representation language (recent suggestion by Andrea Asperti and Hugo Herbelin)
semi-automated import of natural language math (arXive, WP, PlanetMath?) - the Kohlhase group has been doing some Latex to MathML? autotranslation recently
... many more ...
Conclusion
a very exciting project
formal math is the field where computers can help a lot to humans, interesting AI can be developed
but formal math has not taken-off yet
we know from Wikipedia that if the system is designed well, things can happen
a lot can be done to improve things, success depends on making this a project for not just a couple of university people, but to make volunteers interested