Ideas about Math Wiki by Josef Urban

(written in Mizar TWiki smile )

My background in Formal math, and web and tool support for it

Writing formal Mizar articles

in set theory (measurable, Mahlo, and inaccessible cardinals) and order sorted algebra

Developing an Emacs authoring environment for Mizar (MizarMode)

  • ca. 6000 lines of Emacs Lisp,
  • used both by normal users and for education,
  • integrates many other systems and functionalities:
  • MmlQuery, MizarProofAdvisor, MoMM, explanation of overloading, proof skeletons, various Mizar utilities, etc.

Fully XML-ized Mizar processing (MizarXmlImplementation)

Translation of Mizar to ATP formats and work on strong ATP for Mizar

  • export to untyped first order logic
  • done in such a way that large parts of MML can today be independently checked by an ATP-based toolchain
  • number of experiments with automated reproving of Mizar theorems (reached ca. 60% reprove success for non-arithmetical Mizar theorems, and 99.9% success for Mizar atomic justification steps)
  • finding new and better proofs using ATPs - feedback for improving MML
  • ATP competitions
  • ATP-based explanation of complicated Mizar steps
  • advanced AI systems trying to simulate mathematicians' thinking in large theories (Malarea - Machine Learner for Automated Reasoning - won last the 2008 CASC MZR category)
  • recent work for Isabelle (with Lary Paulson, Tobias Nipkow and Jasmin Blanchette) in this direction - Trained Isabelle Proof Advisor (http://lipa.ms.mff.cuni.cz/~urban/isab_adv_demo.html), Malarea improving Larry's Sledgehammer ATP linkup for Isabelle
  • Experimental Proof Advisor for HOL Light trained on Hales' Jordan

Web, versioning, etc.

  • Established wiki for Mizar developers in 2002 (TWiki), some people even use it
  • Put ca. 100 versions of MML into CVS, set up a web access (showing diffs, etc.) and custom syntax highlighting via ViewCVS?, also in 2002, unfortunately not used by the Mizar Library Committee and eventually died
  • In 2004 wrote the MizarWishlist for the 30 years of Mizar workshop
    • Suggested to go big by comparing to Wikipedia, open-source Mizar, have clear licensing for MML
    • Suggested to include non-verified theorems in MML in some way (grey zone) to enable formalization of all math parallelly, not linearly, and to make use of all the good semantic processing features (search, ATP, etc.) for the grey zone
    • Suggested to try to push semantic annotation (grey zone) of the main results from the informal math articles, hopefully resulting in much better search for math (based on my MoMM experience)
    • Suggested a wiki for MML where humans cooperate with other tools like ATPs
    • etc., all of it pretty much ignored

My Recent work on web interface prototype for Mizar (could be called MML Wiki at some point)

http://octopi.mizar.org/~mptp/MizAR.html
  • Server verifies a Mizar article, the output is viewable, errors not shown yet
  • The article is immediately fully HTML-ized and linked to the existing library using my Mizar XML to HTML translation with some tweaks
  • The article is immediately translated to TPTP, and corresponding ATP problems are created for all Mizar theorems and atomic steps in the article (done by MpTP, tweaked to be relatively efficient)
  • The user then can:
    • check the Mizar inferences independently by ATPs
    • if successful, the system shows the exact part of the Mizar machinery (type system, etc.) needed for justifying the step
    • this can be used to prove by ATP steps which are too hard for Mizar, i.e., the ATPs can provide a proof and the necessary Mizar theorems even when the user has not explained the proof to Mizar yet - proof assistance using ATPs
    • the system can also render the ATP proof using the IDV derivation viewer, and try to automatically classify the important lemmas in the proof using the AGiNT? system
    • if a fast proof is not possible, the user is dropped with the problem into the SystemOnTPTP? interface and try to solve the problem with other systems and resources; this can be again used for proof assistance
    • if the user does not know which Mizar theorems and definitions to use for the proof, he can ask the integrated MizarProofAdvisor for theorem suggestions for that particular conjecture

Current state of development

  • started in late 2008
  • very experimental, trying to integrate a number of functionalities that I or others did for Mizar so that they were useful for users
  • uses Perl for cgi and plumbing
  • Prolog (SWI) for the ATP problem creation and management of the translated MML (the (MPTP project has ca. 6000 Prolog lines right now)
  • XSL (xsltproc) for translation of Mizar XML to HTML and to extended TPTP (ca. 14000 lines of XSL, compacted to ca. 6500 lines of XSLTXT)

Overview of plans - pretty much general and applicable to any formal Math Wiki

  • provide persistence, versioning, and authentication - save users' work on new articles on the server (already done) and make it persistently accessible to them (not done yet) - this is a lot, but not all, of the "wiki" functionality
  • provide editing access also to the existing MML articles - combined with previous this is the "wiki", this means:
    • choosing the versioning model
    • finding solutions to dependency issues
    • finding suitable cohabitation of verified and unverified math
  • choosing a suitable storage model
  • licensing and publication - making all versions downloadable and locally available (git repo?)
  • choosing a suitable editing model
  • thoughts on system suitability
  • opening up the whole thing to external developers and users who want to add some functionality
  • adding some other search engines; for Mizar certainly MMLQuery, for Isabelle the "find_theorems" utility, Coq (Helm?), ...
  • note that the linking and rendering issue is already solved for Mizar and perhaps for Coq, I did a crude hack for HOL Light - http://lipa.ms.mff.cuni.cz/~urban/hol_light_html/ ; talked with Makarius about Isabelle's html-ization - Makarius wants it but not yet there
  • Other optional cool things

Choosing the versioning model

possibilities:

  • a sea of equal clones (github), all users and all branches/repositories are formally equal
  • only one strongly guarded current version as in WP
  • a couple of more and less guarded development branches (projects like Debian, Ubuntu, Linux)
  • suitable combinations thereof

Choosing a suitable storage model

  • currently just in filesystem, good for experimenting and debugging, storage is cheap today, used e.g. for TWiki and MMLQuery, bad for complete download of the repository and for truly distributed model
  • compressed filesystem - I have tried FuseCompress? (http://miio.net/wordpress/projects/fusecompress/) - transparent, provides additional storage in comparison to raw filesystem
  • SQL database - the Wikipedia (Mediawiki) solution, things like memory caching taken care of, some search capabilities in the SQL engines (not sure how useful for Math Wiki); also user authentication
  • XML database - tried with the eXist native XML DB, provides autoindexing, powerful XQuery search language for structured data, easy XSL-based HTML presentation of results; problems - was still quite experimental in 2005, not for heavy-duty usage
  • Good version system (git, mercurial, ...) - very big advantage is that cloning is cheap in terms of space and time - virtually unlimited versioning and cloning (see github for a large scale deployment example); also user authentication for free, supported diff and merge functionalities, web frontends for version management, easy download and replication

Choosing a suitable editing model

possibilities:
  • simple editing window (now my default) - for small changes or one-time upload of large pieces
  • JavaScript? editor - Cezary's ProofWeb?, many js editors readily available today (e.g. for Media Wiki) - how suitable are they for Math Wiki?
  • linkage (modes) for Emacs, JEdit, Eclipse, and other editors for power users - e.g. a minor Math Wiki editing mode for Emacs with similar functionalities to CVS and Git modes)

Finding solutions to dependency issues

  • in Mizar to some extent easy, to some extent harder; generally hard and system-specific (just did a bit of it for Isabelle with Jasmin Blanchette - they are building the Mirabelle system for tasks like this)
  • there is always the "slow default": re-verify all suspicious dependent parts, however we would like to be much more specific
  • a lot of the Mizar dependence issues solved already for the TPTP export, reproving, a machine learning on proofs, and fast processing
    • how articles (crudely) depend on articles (easy, done)
    • how theorems depend on theorems and lemmas (easy, done)
    • how theorems depend on definitions (done to some extent by hacking Mizar, the rest is either imprecise, or needs more Mizar hacking, or needs ATP-based explanation (this is done and works quite well))
    • how theorems depend on user-supplied automations (Mizar type system) - done by ATP-based explanation (works quite well, however could be made faster and better by unknown amount of Mizar hacking)
  • various kinds of dependency events:
    • change a proof without changing the proved theorem - no problem
    • change a theorem or definition meaning - dependent theorems become unsafe and need to be re-verified
    • change symbol's typing - each occurrence of the symbol becomes unsafe and needs to be re-verified
    • rename symbol - needs to be carried out consistently everywhere
    • ... etc.

Cohabitation of verified and unverified math

  • unverified math needed for parallel import of all of math, the serial approach slows us down
  • the unverified formal math at least required to parse - already quite a strong check with many dependencies - lots of work in Mizar
  • trust-based and social metrics usable for unverified math - e.g. like article grading in WP
  • if needed, the verified math depending on unverified math can always be made independent of it, by stating the unverified assumptions explicitly
  • life is dangerous, even formal verified math does not protect against buggy definitions
  • sometimes we will get buggy unverified theorems, but my experience is that this is not a big danger - Mizar formalization is quite often also planned on the high level first, possibly with bugs, but if the planning is good and the proof works, this is not so hard to correct later

Opening up the whole thing to external developers and users who want to add some functionality

examples of things others might like to plug in as robots and services in WP:

  • MizarProofAdvisor, MoMM?, MMLQuery
  • Sledgehammer, Nitpick, Quickfind for Isabelle
  • Mirabelle (Isabelle)
  • ATPs finding better proofs (Mizar)
  • Mizar proof improving utilities running in the background
  • Latex presentation
  • Customizable user presentation modes
  • ..., etc.

Thoughts on system suitability

  • the more expert-only the underlying system, the worse to make many mathematicians involved
  • two purposes: for current expert users (give them the system they want) and for new people (give them something stupid simple)
  • do you think higher order and constructive math are the right thing for a newcomer??
  • do you think tactical proofs are the right thing for a newcomer?
  • should not the proofs be also readable?
  • hence my preference for declarative proofs and probably ZF
  • I am not saying Mizar is perfect - the Mizar "type system" can make things quite difficult to understand, however I don' know anything better yet
  • some new declarative proof language (improved Isar/ZF?) - this is very dangerous, we should not start by designing a new language
  • the Math Wiki should be developed suitably generically, so that when systems improve, they can be easily plugged in

Other optional cool things

  • merging of libraries - providing tagging of definitions and theorems with a common language - perhaps using WP concepts?,
  • having AI algos suggesting new similar pieces based on already identified similarities (e.g. analogous theorems used in proofs means that the proved things are also likely to be analogous) - lots of interesting AI
  • providing translations (possibly crude and incomplete, but at least a bit automated) between systems - this is generally hard, but in many cases could be useful (I did it for Mizar and TPTP, plans could be e.g. import of MML to Isabelle/ZF, some basic translation of first-orderized MML to HOL and Isabelle/HOL)
  • having a common representation language (recent suggestion by Andrea Asperti and Hugo Herbelin)
  • semi-automated import of natural language math (arXive, WP, PlanetMath?) - the Kohlhase group has been doing some Latex to MathML? autotranslation recently

  • ... many more ...

Conclusion

  • a very exciting project
  • formal math is the field where computers can help a lot to humans, interesting AI can be developed
  • but formal math has not taken-off yet
  • we know from Wikipedia that if the system is designed well, things can happen
  • a lot can be done to improve things, success depends on making this a project for not just a couple of university people, but to make volunteers interested

Thank you, Questions?

-- JosefUrban - 2009-07-17

Topic revision: r2 - 2009-07-17 - 11:01:56 - JosefUrban

tip TWiki Tip of the Day
Keyword search
Interactive search is a keyword search by default. For example, to search for all topics that contain ... Read on Read more

 
Mizar TWiki
This site is powered by the TWiki collaboration platformCopyright © by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki? Send feedback