#+OPTIONS: H:3 num:nil toc:t \n:nil @:t ::t |:t ^:t -:t f:t *:t TeX:t LaTeX:t skip:nil d:(HIDE) tags:not-in-toc
#+TITLE: rorg --- Code evaluation in org-mode, with an emphasis on R
#+SEQ_TODO: TODO PROPOSED | DONE DROPPED MAYBE
#+STARTUP: oddeven

* Overview
This project is basically about putting source code into org
files. This isn't just code to look pretty as a source code example,
but code to be evaluated. Org files have 3 main export targets: org,
html and latex. Once we have implemented a smooth bi-directional flow
of data between org-mode formats (including tables, and maybe lists
and property values) and source-code blocks, we will be able to use
org-mode's built-in export to publish this data in any org-supported
format using org-mode as an intermediate format. We have a current
focus on R code, but we are regarding that more as a working example
than as a defining feature of the project.

The main objectives of this project are...

# Let's start with this list and make changes as appropriate. Please
# try to make changes to this list, rather than starting any new
# lists.

- [[* evaluation of embedded source code][evaluation of embedded source code]]
  - [[* execution on demand and on export][execution on demand and on export]]
  - [[* evaluation of source blocks][evaluation of source blocks]]
  - [[* inline source evaluation][inline source evaluation]]
- [[* interaction with the source-code's process][interaction with the source-code's process]]
- [[* output of code evaluation][output of code evaluation]]
  - [[* textual/numeric output][textual/numeric output]]
  - [[* graphical output][graphical output]]
  - [[* non-graphics files][non-graphics file creation]]
  - [[* side effects][side effects]]
- [[* reference to data and evaluation results][reference to data and evaluation results]]: This could happen in many
  directions
  - [[* reference format][reference format]]
  - [[* source-target pairs][source-target pairs]]
    - [[* source block output from org tables][source block output from org tables]]
    - [[* source block outpt from other source block][source block output from other source block]]
    - [[* source block output from org list][source block output from org list]] ?? maybe
    - [[* org table from source block][org table from source block]]
    - [[* org table from org table][org table from org table]]
    - [[* org properties from source block][org properties from source block]]
    - [[* org properties from org table][org properties from org table]]
- [[* caching of evaluation][caching of evaluation]]
- [[* export][export]]


* Objectives and Specs

** evaluation of embedded source code

*** execution on demand and on export
Let's use an asterisk to indicate content which includes the *result* of
code evaluation, rather than the code itself. Clearly
we have a requirement for the following transformation:

org \to org*

Let's say this transformation is effected by a function
`org-eval-buffer'. This transformation is necessary when the
target format is org (say you want to update the values in an org
table, or generate a plot and create an org link to it), and it
can also be used as the first step by which to reach html and
latex:

org \to org* \to html

org \to org* \to latex

Thus in principle we can reach our 3 target formats with
`org-eval-buffer', `org-export-as-latex' and `org-export-as-html'.

An extra transformation that we might want is

org \to latex

I.e. export to latex without evaluation of code, in such a way that R
code can subsequently be evaluated using
=Sweave(driver=RweaveLatex)=, which is what the R community is
used to. This would provide a `bail out' avenue where users can
escape org mode and enter a workflow in which the latex/noweb file
is treated as source.

**** How do we implement `org-eval-buffer'?

AIUI the following can all be viewed as implementations of
org-eval-buffer for R code:

***** org-eval-light
This is the beginnings of a general evaluation mechanism that
could evaluate python, ruby, shell, perl, in addition to R.
The header says it's based on org-eval.

What is org-eval??

org-eval was written by Carsten. It lives in the
org/contrib/lisp directory because it is too dangerous to
include in the base. Unlike org-eval-light, org-eval evaluates
all source blocks in an org-file when the file is first opened,
which could be a security nightmare, for example if someone
emailed you a pernicious file.

***** org-R
This accomplishes org \to org* in elisp by visiting code blocks
and evaluating code using ESS.

***** RweaveOrg
This accomplishes org \to org* using R via

: Sweave("file-with-unevaluated-code.org", driver=RweaveOrg, syntax=SweaveSyntaxOrg)

***** org-exp-blocks.el
Like org-R, this achieves org \to org* in elisp by visiting code
blocks and using ESS to evaluate R code.

*** evaluation of source blocks
(see [[* Special editing and evaluation of source code][Special editing and evaluation of source code]])

*** inline source evaluation

** interaction with the source-code's process
We should settle on a uniform API for sending code and receiving
output from a source process. Then to add a new language all we need
to do is implement this API.
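
As a rough illustration of such a uniform API, here is a minimal sketch. It is written in Python only to keep the example short and runnable; the project itself would do this in elisp. The names =INTERPRETERS=, =register=, =run_command= and =evaluate= are hypothetical, and each evaluator starts a fresh process per call rather than talking to a persistent session such as one managed by ESS.

#+begin_src python
import subprocess
import sys

# Hypothetical registry: language name -> function(code) -> output text.
INTERPRETERS = {}


def register(lang):
    """Return a decorator that files an evaluator under the name LANG."""
    def wrap(fn):
        INTERPRETERS[lang] = fn
        return fn
    return wrap


def run_command(argv, code):
    """Run ARGV with CODE on standard input and return its standard output."""
    done = subprocess.run(argv, input=code, capture_output=True,
                          text=True, check=True)
    return done.stdout


@register("python")
def eval_python(code):
    # "python -" reads the program from standard input.
    return run_command([sys.executable, "-"], code)


@register("shell")
def eval_shell(code):
    # Assumes a POSIX sh on PATH; sh reads commands from standard input.
    return run_command(["sh"], code)


def evaluate(lang, code):
    """Send CODE to the evaluator registered for LANG and return its output."""
    try:
        evaluator = INTERPRETERS[lang]
    except KeyError:
        raise ValueError("no evaluator registered for " + repr(lang))
    return evaluator(code)
#+end_src

Under this scheme, supporting R would mean registering one more evaluator function, with no change to =evaluate= or its callers.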

For related notes see ([[* Interaction with the R process][Interaction with the R process]])

** output of code evaluation
*** textual/numeric output
We (optionally) incorporate the text output as text in the target
document.
*** graphical output
We either link to the graphics or (html/latex) include them
inline.

I would say, if the block is being evaluated interactively then
let's pop up the image in a new window, and if it is being exported
then we can just include a link to the file which will be exported
appropriately by org-mode.

*** non-graphics files
? We link to other file output
*** side effects
If we are using a continuous process (for example an R process
handled by ESS) then any side effects of the process (for example
setting values of R variables) will be handled automatically.

Are there side-effects which need to be considered aside from those
internal to the source-code evaluation process?

** reference to data and evaluation results
I think this will be very important. I would suggest that since we
are using lisp we use lists as our medium of exchange. Then all we
need are functions converting all of our target formats to and
from lists. These functions are already provided for org tables.
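
To make the "lists as medium of exchange" idea concrete, here is a small sketch of the round trip between org table text and a list of rows. It is in Python for illustration only (the real conversion would use the existing elisp table functions), and =org_table_to_lists= and =lists_to_org_table= are hypothetical names. Cells are kept as strings, and horizontal rule lines are dropped rather than preserved.

#+begin_src python
def org_table_to_lists(text):
    """Parse org table text into a list of rows, each a list of cell strings."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Skip anything that is not a table row, and skip |---+---| rules.
        if not line.startswith("|") or line.startswith("|-"):
            continue
        inner = line[1:-1] if line.endswith("|") else line[1:]
        rows.append([cell.strip() for cell in inner.split("|")])
    return rows


def lists_to_org_table(rows):
    """Render a list of rows as org table text with aligned columns."""
    if not rows:
        return ""
    ncols = max(len(row) for row in rows)
    # Convert every cell to text and pad short rows with empty cells.
    padded = [[str(cell) for cell in row] + [""] * (ncols - len(row))
              for row in rows]
    widths = [max(len(row[i]) for row in padded) for i in range(ncols)]
    lines = []
    for row in padded:
        cells = [cell.ljust(width) for cell, width in zip(row, widths)]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)
#+end_src

Once every format has such a pair of converters, any source can feed any target by going through the list representation.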

It would be a boon both to org users and R users to allow org tables
to be manipulated with the R programming language. Org tables give R
users an easy way to enter and display data; R gives org users a
powerful way to perform vector operations, statistical tests, and
visualization on their tables.

This means that we will need to consider unique ids for source
blocks, as well as for org tables, and for any other data source or
target.

*** Implementations
**** naive
A naive implementation would be to use =(org-export-table "tmp.csv")=
and =(ess-execute "read.csv('tmp.csv')")=.
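
The same two steps can be sketched outside Emacs. The following Python sketch writes rows to a temporary csv file and builds the R command that would read it back. Only the command string is built; sending it to an R process is left out. =export_rows_to_csv= and =r_read_command= are hypothetical names, as is the default R variable name =tbl=.

#+begin_src python
import csv
import os
import tempfile


def export_rows_to_csv(rows, path=None):
    """Write ROWS (a list of lists) to a csv file and return the file path."""
    if path is None:
        handle, path = tempfile.mkstemp(suffix=".csv")
        os.close(handle)
    with open(path, "w", newline="") as out:
        csv.writer(out).writerows(rows)
    return path


def r_read_command(path, variable="tbl"):
    """Build the R statement that loads the csv file at PATH into VARIABLE."""
    # Escape backslashes and single quotes for an R single-quoted string.
    escaped = path.replace("\\", "\\\\").replace("'", "\\'")
    return "{} <- read.csv('{}')".format(variable, escaped)
#+end_src

The weakness of this approach is visible in the sketch: every transfer touches the file system, and cell types are whatever =read.csv= guesses them to be.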
**** org-R
org-R passes data to R from two sources: org tables, or csv
files. Org tables are first exported to a temporary csv file
using [[file:existing_tools/org-R.el::defun%20org%20R%20export%20to%20csv%20csv%20file%20options][org-R-export-to-csv]].
**** org-exp-blocks
org-exp-blocks uses [[org-interblock-R-command-to-string]] to send
commands to an R process running in a comint buffer through ESS.
org-exp-blocks has no support for dumping table data to an R process, or
vice versa.

**** RweaveOrg
NA

*** reference format
This will be tricky. Dan has already come up with a solution for R; I
need to look more closely at that, and we should try to come up with a
format for referencing data from source-code in such a way that it
will be as source-code-language independent as possible.

*** source-target pairs

The following can be used for special considerations based on
source-target pairs

**** source block output from org tables
**** source block outpt from other source block
**** source block output from org list
**** org table from source block
**** org table from org table
**** org properties from source block
**** org properties from org table

** caching of evaluation

I'm personally not clear on how this would be implemented, but it does
seem to be important. I'd be interested to hear how Sweave
accomplishes this. Should it be based on tracking changes in source
blocks?
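
One way to track changes in source blocks is to key cached results on a hash of the block's language, header arguments and body, so that a block is only re-evaluated when its text changes. The Python sketch below shows the idea; it is not a description of how Sweave does it, and =ResultCache= and its methods are hypothetical names. It ignores anything outside the block text, such as changes to referenced tables or to state in a persistent session.

#+begin_src python
import hashlib


class ResultCache:
    """Cache evaluation results keyed on a hash of the block's text."""

    def __init__(self):
        self._results = {}

    @staticmethod
    def key(lang, header_args, body):
        """Return a hex digest that changes whenever any part changes."""
        digest = hashlib.sha1()
        for part in (lang, header_args, body):
            digest.update(part.encode("utf-8"))
            # Separator so that ("ab", "c") and ("a", "bc") differ.
            digest.update(b"\0")
        return digest.hexdigest()

    def evaluate(self, lang, header_args, body, evaluator):
        """Return the cached result, calling EVALUATOR(body) only on a miss."""
        block_key = self.key(lang, header_args, body)
        if block_key not in self._results:
            self._results[block_key] = evaluator(body)
        return self._results[block_key]
#+end_src

A real implementation would also need a rule for invalidating a block when something it depends on is re-evaluated.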

** export
Once the previous objectives are met export should be fairly simple.
Basically it will consist of triggering the evaluation of source code
blocks with the org-export-preprocess-hook.

This block export evaluation will be aware of the target format
through the htmlp and latexp variables, and can then create quoted
=#+begin_html= and =#+begin_latex= blocks appropriately.
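
The format-dependent wrapping could look roughly like the following Python sketch. The function name =wrap_result_for_export= is hypothetical, and the two flags simply mirror the htmlp and latexp variables mentioned above; with neither flag set the result is passed through unchanged, as for an org target.

#+begin_src python
def wrap_result_for_export(result, htmlp=False, latexp=False):
    """Quote RESULT in the block type that matches the export target."""
    if htmlp and latexp:
        raise ValueError("only one export target can be active")
    if htmlp:
        return "#+begin_html\n" + result + "\n#+end_html"
    if latexp:
        return "#+begin_latex\n" + result + "\n#+end_latex"
    # No target flag set: leave the result as plain org text.
    return result
#+end_src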


* Notes
** Special editing and evaluation of source code
Unfortunately org-mode has two different block types, both useful.
In developing RweaveOrg, a third was introduced.

Eric is leaning towards using the =#+begin_src= blocks, as that is
really what these blocks contain: source code. Austin believes
that specifying export options at the beginning of a block is
useful functionality, to be preserved if possible.

Note that upper and lower case are not relevant in block headings.

*** block format
**** PROPOSED block format
I (Eric) propose that we use the syntax of source code blocks as they
currently exist in org-mode with the addition of *evaluation*,
*header-arguments*, *exportation*, *single-line-blocks*, and
*references-to-table-data*.

1) *evaluation*: These blocks can be evaluated through =\C-c\C-c= with
   a slight addition to the code already present and working in
   [[file:existing_tools/org-eval-light.el][org-eval-light.el]]. All we should need to add for R support would
   be an appropriate entry in [[org-eval-light-interpreters]] with a
   corresponding evaluation function. For an example using
   org-eval-light see [[* src block evaluation w/org-eval-light]].

2) *header-arguments*: These can be implemented along the lines of
   Austin's header arguments in [[file:existing_tools/RweaveOrg/org-sweave.el][org-sweave.el]].

3) *exportation*: Should be as similar as possible to that done by
   Sweave, and hopefully can re-use some of the code currently present
   in [[file:existing_tools/exp-blocks/org-exp-blocks.el][org-exp-blocks.el]].

4) *single-line-blocks*: It seems that it is useful to be able to
   place a single line of R code on a line by itself. Should we add
   syntax for this similar to Dan's =#+RR:= lines? I would lean
   towards something here that can be re-used for any type of source
   code in the same manner as the =#+begin_src R= blocks, maybe
   =#+src_R=? Dan: I'm fine with this, but don't think single-line
   blocks are a priority. My =#+R= lines were something totally
   different: an attempt to have users specify R code implicitly,
   using org-mode option syntax.

5) *references-to-table-data*: I get the impression that this is
   vital to the efficient use of R code in an org file, so we should
   come up with a way to reference table data from a single-line-block
   or from an R source-code block. It looks like Dan has already done
   this in [[file:existing_tools/org-R.el][org-R.el]].

Syntax

Multi-line Block
: #+begin_src lang header-arguments
: body
: #+end_src
- lang :: the language of the block (R, shell, elisp, etc...)
- header-arguments :: a list of optional arguments which control how
  the block is evaluated and exported, and how the results are handled
- body :: the actual body of the block
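
To show how the three parts of a multi-line block could be picked apart, here is a regular-expression sketch in Python (illustrative only; =BLOCK_RE= and =parse_blocks= are hypothetical names). It matches case-insensitively, accepts either =#+end= or =#+end_src= as the closing line, and returns the header arguments as one unparsed string, since their syntax is not settled here.

#+begin_src python
import re

# One multi-line block: header line, body, closing line.
BLOCK_RE = re.compile(
    r"^[ \t]*#\+begin_src[ \t]+(?P<lang>\S+)[ \t]*(?P<args>[^\n]*)\n"
    r"(?P<body>.*?)\n?"
    r"^[ \t]*#\+end(?:_src)?[ \t]*$",
    re.IGNORECASE | re.MULTILINE | re.DOTALL,
)


def parse_blocks(text):
    """Return one dict per source block found in TEXT, in document order."""
    blocks = []
    for match in BLOCK_RE.finditer(text):
        blocks.append({
            "lang": match.group("lang"),
            "header_args": match.group("args").strip(),
            "body": match.group("body"),
        })
    return blocks
#+end_src

Each returned dict carries exactly the lang, header-arguments and body fields described above, so an evaluator can be chosen from the =lang= value alone.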

Single-line Block
: #+begin_src lang body
- It's not clear how/if we would include header-arguments into a
  single line block. Suggestions? Can we just leave them out? Dan:
  I'm not too worried about single line blocks to start off
  with. Their main advantage seems to be that they save 2 lines.

Include Block
: #+include_src lang filename header-arguments
- I think this would be useful, and should be much more work (Dan:
  didn't get the meaning of that last clause!?). That way whole
  external files of source code could be evaluated as if they were an
  inline block. Dan: again I'd say not a massive priority, as I think
  all the languages we have in mind have facilities for doing this
  natively, thus I think the desired effect can often be achieved from
  within a #+begin_src block.

What do you think? Does this accomplish everything we want to be able
to do with embedded R source code blocks?

***** src block evaluation w/org-eval-light
Here's an example using org-eval-light.el.

First load the org-eval-light.el file:

[[elisp:(load (expand-file-name "org-eval-light.el" (expand-file-name "existing_tools" (file-name-directory buffer-file-name))))]]

Then press =\C-c\C-c= inside of the following src code snippet. The
results should appear in a comment immediately following the source
code block. It shouldn't be too hard to add R support to this
function through the `org-eval-light-interpreters' variable.

(Dan: The following causes error on export to HTML hence spaces inserted at bol)

 #+begin_src shell
 date
 #+end_src

**** Source code blocks
Org has an extremely useful method of editing source code and
examples in their native modes. In the case of R code, we want to
be able to use the full functionality of ESS mode, including
interactive evaluation of code.

Source code blocks look like the following and allow for the
special editing of code inside of the block through
`org-edit-special'.

#+BEGIN_SRC r

,## hit C-c ' within this block to enter a temporary buffer in r-mode.

,## while in the temporary buffer, hit C-c C-c on this comment to
,## evaluate this block
a <- 3
a

,## hit C-c ' to exit the temporary buffer
#+END_SRC

**** dblocks
dblocks are useful because org-mode will automatically call
`org-dblock-write:dblock-type' where dblock-type is the string
following the =#+BEGIN:= portion of the line.

dblocks look like the following and allow for evaluation of the
code inside of the block by calling =\C-c\C-c= on the header of
the block.

#+BEGIN: dblock-type
#+END:

**** R blocks
In developing RweaveOrg, Austin created [[file:existing_tools/RweaveOrg/org-sweave.el][org-sweave.el]]. This
allows for the kind of blocks shown in [[file:existing_tools/RweaveOrg/testing.Rorg][testing.Rorg]]. These blocks
have the advantage of accepting options to the Sweave preprocessor
following the #+BEGIN_R declaration.

*** block headers/parameters
Regardless of the syntax/format chosen for the source blocks, we will
need to be able to pass a list of parameters to these blocks. These
should include (but should certainly not be limited to)
- label of the block
- names of files to which graphical/textual/numerical/tabular output
  should be written
- flags for when/if the block should be evaluated (on export etc...)
- flags for how the results of the export should be displayed/included
- flags specific to the language of the source block
- etc...

** Interaction with the R process

We should take care to implement this in such a way that it serves all
of the different components which have to interact with R, including:
- evaluation of source code blocks
- automatic evaluation on export
- evaluation of \R{} snippets
- evaluation of single source code lines
- sending/receiving vector data

I think we currently have two implementations of interaction with R
processes: [[file:existing_tools/org-R.el][org-R.el]] and [[file:existing_tools/exp-blocks/org-exp-blocks.el][org-exp-blocks.el]]. We should be sure to take
the best of each of these approaches.


* Tasks


* COMMENT Commentary
I'm seeing this as like commit notes, and a place for less formal
communication of the goals of our changes.

** Eric <2009-02-06 Fri 15:41>
I think we're getting close to a comprehensive set of objectives
(although since you two are the real R users I leave that decision up
to you). Once we've agreed on a set of objectives and agreed on at
least the broad strokes of implementation, I think we should start
listing out and assigning tasks.

** Eric <2009-02-09 Mon 14:25>
I've done a fairly destructive edit of this file. The main goal was
to enforce a structure on the document that we can use moving forward,
so that any future objective changes are all made to the main
objective list.

I apologize for removing sections written by other people. I did this
when they were redundant or it was not clear how to fit them into this
structure. Rest assured, if the previous text weren't preserved in git
I would have been much more cautious about removing it.

I hope that this outline structure will be able to remain stable
through the process of fleshing out objectives, and cashing those
objectives out into tasks. That said, please feel free to make any
changes that you see fit.


* Buffer Dictionary
LocalWords: DBlocks dblocks