2009-08-29

Learning PHP, developing openShop, and using tidy

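I have long wanted to study PHP properly and build an online shop management system (under a provisional name), but I never got very far with it. My plan now is to build on opentiss.net: besides finding ways to offer services around open-source CMSs and other LAMP software, I can try developing a small system and use it to test and deepen what I learn, laying a foundation for improving my skills and for my future career. Finding a decent job on C/C++ alone is really not easy, especially while the economic outlook remains uncertain.
I looked into the Mambo system and have now started building a rough framework. First, I chose XHTML 1.1 for presenting web content, but it seems that few websites use this standard. An obvious example is the W3C's own site, which uses XHTML 1.0; only its XHTML 1.1 specification document is actually written in XHTML 1.1. Still, there was a useful find: while viewing the source of the XHTML 1.1 specification, I discovered Tidy, a tool that can validate XHTML. I then installed tidy on SLED 11; the installation automatically pulls in the dependency libtidy. tidy is simple to use. For example, to validate index.html, the output of a PHP site downloaded locally with wget, just enter the following command in a terminal: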
opentiss@tiss:~/Documents/tmp> tidy -asxhtml -utf8 < index.html > index.tidy.html
Info: Doctype given is "-//W3C//DTD XHTML 1.1//EN"                             
Info: Document content looks like XHTML 1.1                                    
No warnings or errors were found.                                              


To learn more about HTML Tidy see http://tidy.sourceforge.net
Please send bug reports to html-tidy@w3.org
HTML and CSS specifications are available from http://www.w3.org/
Lobby your company to join W3C, see http://www.w3.org/Consortium
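When the check is run from a script rather than read by eye, tidy's exit status can stand in for the summary line. The function below is only a minimal sketch: the name `check_xhtml` is made up for illustration, and it assumes the exit codes described in tidy's manual page (0 for a clean document, 1 for warnings, 2 for errors).

```shell
#!/bin/sh
# check_xhtml FILE: run tidy in errors-only mode and summarise the result.
# Hypothetical helper; assumes tidy's documented exit status:
# 0 = clean, 1 = warnings, 2 = errors.
check_xhtml() {
    file=$1
    rc=0
    # -q suppresses nonessential output, -e reports problems only and
    # writes no cleaned-up markup, -utf8 matches the encoding used above.
    tidy -q -e -utf8 "$file" || rc=$?
    case $rc in
        0) echo "$file: clean" ;;
        1) echo "$file: warnings" ;;
        2) echo "$file: errors" ;;
        *) echo "$file: tidy failed (status $rc)" ;;
    esac
    return "$rc"
}
```

Calling `check_xhtml index.html` prints a one-line verdict and returns tidy's own status, so it can be used directly in an `if` or in a build step.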
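If tidy prints "No warnings or errors were found.", it found nothing in your XHTML document that conflicts with the standard. To learn how to use the tool, enter the following command in a terminal to view the help: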
opentiss@tiss:~> tidy -help
tidy [option...] [file...] [option...] [file...]
Utility to clean up and pretty print HTML/XHTML/XML
see http://tidy.sourceforge.net/                   

Options for HTML Tidy for Linux/x86 released on 31 October 2006:

File manipulation
-----------------
 -output <file>, -o  write output to the specified <file>                      
 <file>                                                                        
 -config <file>      set configuration options from the specified <file>       
 -file <file>, -f    write errors to the specified <file>                      
 <file>                                                                        
 -modify, -m         modify the original input files                           

Processing directives
---------------------
 -indent, -i         indent element content                                    
 -wrap <column>, -w  wrap text at the specified <column>. 0 is assumed if      
 <column>            <column> is missing. When this option is omitted, the     
                     default of the configuration option "wrap" applies.       
 -upper, -u          force tags to upper case                                  
 -clean, -c          replace FONT, NOBR and CENTER tags by CSS                 
 -bare, -b           strip out smart quotes and em dashes, etc.                
 -numeric, -n        output numeric rather than named entities                 
 -errors, -e         only show errors                                          
 -quiet, -q          suppress nonessential output                              
 -omit               omit optional end tags                                    
 -xml                specify the input is well formed XML                      
 -asxml, -asxhtml    convert HTML to well formed XHTML                         
 -ashtml             force XHTML to well formed HTML                           
 -access <level>     do additional accessibility checks (<level> = 0, 1, 2, 3).
                     0 is assumed if <level> is missing.                       

Character encodings
-------------------
 -raw                output values above 127 without conversion to entities    
 -ascii              use ISO-8859-1 for input, US-ASCII for output             
 -latin0             use ISO-8859-15 for input, US-ASCII for output            
 -latin1             use ISO-8859-1 for both input and output                  
 -iso2022            use ISO-2022 for both input and output                    
 -utf8               use UTF-8 for both input and output                       
 -mac                use MacRoman for input, US-ASCII for output               
 -win1252            use Windows-1252 for input, US-ASCII for output           
 -ibm858             use IBM-858 (CP850+Euro) for input, US-ASCII for output   
 -utf16le            use UTF-16LE for both input and output                    
 -utf16be            use UTF-16BE for both input and output                    
 -utf16              use UTF-16 for both input and output
 -big5               use Big5 for both input and output
 -shiftjis           use Shift_JIS for both input and output
 -language <lang>    set the two-letter language code <lang> (for future use)

Miscellaneous
-------------
 -version, -v        show the version of Tidy
 -help, -h, -?       list the command line options
 -xml-help           list the command line options in XML format
 -help-config        list all configuration options
 -xml-config         list all configuration options in XML format
 -show-config        list the current configuration settings

Use --blah blarg for any configuration option "blah" with argument "blarg"

Input/Output default to stdin/stdout respectively
Single letter options apart from -f may be combined
as in:  tidy -f errs.txt -imu foo.html
For further info on HTML see http://www.w3.org/MarkUp
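The `-config <file>` option in the help text reads the same settings that can be passed on the command line as `--option value`. As a minimal sketch, the function below writes such a file; the name `write_tidy_config` and any file name passed to it are made up for illustration, and the option names used (`output-xhtml`, `char-encoding`, `indent`, `quiet`) should be checked against what `tidy -help-config` lists for your release.

```shell
#!/bin/sh
# write_tidy_config PATH: create a small tidy configuration file.
# Hypothetical helper; the settings roughly mirror "-asxhtml -utf8 -q"
# with automatic indentation added.
write_tidy_config() {
    cat > "$1" <<'EOF'
// settings for XHTML output in UTF-8
output-xhtml: yes
char-encoding: utf8
indent: auto
quiet: yes
EOF
}
```

After `write_tidy_config tidy.conf`, a command such as `tidy -config tidy.conf index.html > index.tidy.html` should behave much like the `-asxhtml -utf8` invocation shown earlier.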
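With tidy you can validate documents at any time, which makes it a convenient alternative when the W3C Markup Validation Service is slow to reach.
In the source code of a PHP project created with Eclipse + PDT, the XML declaration is flagged as an error: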
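For checking several downloaded pages in one go, a small loop around tidy is enough. This is a sketch under assumptions: the function name `validate_all` is hypothetical, and it relies on tidy returning a non-zero exit status whenever it reports warnings or errors.

```shell
#!/bin/sh
# validate_all FILE...: run tidy on each file and count those not clean.
# Hypothetical helper; treats any non-zero tidy exit status as a failure.
validate_all() {
    failed=0
    for f in "$@"; do
        # -q -e: report problems only, write no cleaned-up markup.
        if tidy -q -e -utf8 "$f"; then
            echo "OK   $f"
        else
            echo "FAIL $f"
            failed=$((failed + 1))
        fi
    done
    echo "$failed of $# file(s) had warnings or errors"
    # Succeed only when every file was clean.
    [ "$failed" -eq 0 ]
}
```

For example, `validate_all *.html` prints one line per file followed by a summary, and its return value tells a calling script whether everything passed.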
<?xml version="1.0" encoding="UTF-8"?>
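The fix is simple: output the declaration as a PHP string instead, presumably so that the leading <? is no longer taken for a PHP opening tag: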
<?php echo '<?xml version="1.0" encoding="UTF-8"?>'; ?>
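JavaEye has been unreachable for the past few days. It seems to be undergoing an upgrade, though I don't know the reason; I hope it is back to normal soon.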

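1 comment:

At 4:04:00 PM CST on Monday, August 31, 2009, Blogger 大懒兔 said...

Hehe, Liangliang, don't blame me for being greedy, but could this system of yours be extended to online sales and management for hotels, or could that functionality be added? ^_^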

 
