Reading PPT files in Java; reading DOC files in Java

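‘1’ How to open a PPT in a specified window with Java

Apache POI is a Java library for parsing Office files, and it can parse PPT files. The official site is http://poi.apache.org/, and http://poi.apache.org/slideshow/ covers its PPT reading component.
From a quick look, it parses the text in a PPT into RichTextRun objects, which hold rich text (text plus formatting, roughly comparable to HTML); images apparently have to be fetched separately. Overall it should meet your requirements.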

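‘2’ Reading jpg, pdf, doc, xls, and ppt files uploaded by users in Java, and storing their binary data in a database or as files

Files are generally a poor fit for database storage; a file server or something similar is the better choice. The simplest option is to save them to a directory inside the project.
Uploads are usually done with a form, or with a plugin such as jQuery's uploadify; simple examples are easy to find online. Once the action receives the file, just write it to a new File(path) under the storage directory.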
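
As a minimal sketch of the file-based approach, assume the upload has already arrived as an InputStream (for example from a servlet or framework handler). The class name UploadStore, the method save, and the random-prefix naming scheme are hypothetical choices for illustration, not part of any framework.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.UUID;

/** Hypothetical helper: stores an uploaded stream under a fixed directory. */
class UploadStore {

    /**
     * Copies the upload into uploadDir and returns the stored path.
     * The client-supplied name is reduced to its last path segment so that
     * values such as "../../etc/passwd" cannot escape the directory.
     */
    static Path save(InputStream upload, String clientFileName, Path uploadDir) throws IOException {
        Files.createDirectories(uploadDir);
        // Normalize Windows separators, then keep only the final segment.
        String baseName = Paths.get(clientFileName.replace('\\', '/')).getFileName().toString();
        // A random prefix avoids overwriting files that share a name.
        Path target = uploadDir.resolve(UUID.randomUUID() + "_" + baseName);
        Files.copy(upload, target, StandardCopyOption.REPLACE_EXISTING);
        return target;
    }
}
```

The returned path (or just its file name) is what you would typically record in the database, rather than the file bytes themselves.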

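‘3’ How to read the contents of .wps, .et, and .dps files in Java

Below are three Java examples showing how to read wps/et/dps files.

1. Reading wps (text): load the wps file from a stream and read its text content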

import com.spire.doc.*;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileWriter;
import java.io.IOException;

public class ReadTextFromWPS {
    public static void main(String[] args) throws IOException {
        String text;
        // Load the WPS Writer document from a stream
        try (FileInputStream inputStream = new FileInputStream(new File("test.wps"))) {
            Document doc = new Document();
            doc.loadFromStream(inputStream, FileFormat.Doc);

            // Get the text as a String
            text = doc.getText();
        }

        // Write the String to a txt file
        writeStringToTxt(text, "ReadWPSText.txt");
    }

    public static void writeStringToTxt(String content, String txtFileName) throws IOException {
        // Open in append mode; try-with-resources flushes and closes the writer
        try (FileWriter fWriter = new FileWriter(txtFileName, true)) {
            fWriter.write(content);
        }
    }
}

2. Reading et: load the et spreadsheet file directly and read its data

import com.spire.xls.*;

public class ExcelToText {
    public static void main(String[] args) {
        // Load the spreadsheet file in et format
        Workbook workbook = new Workbook();
        workbook.loadFromFile("test.et");

        // Get the first worksheet
        Worksheet sheet = workbook.getWorksheets().get(0);

        // Get the text data in the specified cell
        CellRange range = sheet.getCellRange("A1");
        String text = range.getText().trim();
        System.out.println(text);
    }
}

3. Reading dps: load the dps presentation file directly and read its text

import com.spire.presentation.IAutoShape;
import com.spire.presentation.ISlide;
import com.spire.presentation.ParagraphEx;
import com.spire.presentation.Presentation;
import java.io.FileWriter;

public class ExtractText {
    public static void main(String[] args) throws Exception {
        // Load the test document
        Presentation ppt = new Presentation();
        //ppt.loadFromFile("test.pptx");
        ppt.loadFromFile("test.dps");

        StringBuilder buffer = new StringBuilder();

        // Walk through the slides in the document and extract the text
        for (Object slide : ppt.getSlides()) {
            for (Object shape : ((ISlide) slide).getShapes()) {
                if (shape instanceof IAutoShape) {
                    for (Object tp : ((IAutoShape) shape).getTextFrame().getParagraphs()) {
                        buffer.append(((ParagraphEx) tp).getText());
                    }
                }
            }
        }

        // Save to a text file; try-with-resources flushes and closes the writer
        try (FileWriter writer = new FileWriter("ExtractTextfromDPS.txt")) {
            writer.write(buffer.toString());
        }
    }
}

These examples require importing spire.office.jar into the Java program.

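‘4’ Apache POI: the difference between fetching all PPT content and fine-grained reading

Sometimes we need to read data from Excel files, or generate reports in Excel format for business or financial purposes. Java has no built-in support for working with Excel files, so we have to look for open-source APIs. When I started looking for APIs to work with Excel, most people recommended JExcel or Apache POI.

After digging deeper, I found Apache POI to be the right choice, mainly for the reasons below. There are further reasons related to advanced features, but we will not go into that much detail.
1) It is backed by the Apache Software Foundation.
2) JExcel does not support the xlsx format, while POI supports both xls and xlsx.
3) Apache POI offers stream-based processing, which makes it better suited to large files and less demanding on memory.
Apache POI provides strong support for processing Excel files and can handle spreadsheets in both xls and xlsx format.

Some important points about Apache POI:
1) Apache POI contains the HSSF implementation for the Excel '97(-2007) file format (.xls files).
2) The Apache POI XSSF implementation handles Excel 2007 files (.xlsx).
3) Apache POI HSSF and XSSF provide mechanisms to read, write, and modify Excel spreadsheets.
4) Apache POI provides SXSSF, an extension of XSSF, for working with very large Excel sheets. The SXSSF API needs less memory, so it is a good fit when processing very large spreadsheets with limited heap memory.
5) There are two models to choose from: the event model and the user model. The event model requires less memory because the Excel file is read as tokens and processed as they arrive. The user model is more object-oriented and easier to use, so we use it in our examples.
6) Apache POI provides strong support for additional Excel features, such as working with formulas and creating cell styles: colors, borders, fonts, headers, footers, data validation, images, hyperlinks, and so on.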

Maven dependencies for Apache POI

<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>3.10-FINAL</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <version>3.10-FINAL</version>
</dependency>

At the time of writing, the current version of Apache POI was 3.10-FINAL. If you are building a standalone Java application, add the corresponding jars to the classpath manually.

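Reading an Excel file
Suppose we have an Excel file named Sample.xlsx that contains two sheets, where each row holds a country's short code and name. We want to read this Excel file and build a list of Countries. Sheet1 contains some extra data, which we ignore while parsing.

Our Country Java bean looks like this:

Country.java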
package com.journaldev.excel.read;

public class Country {

    private String name;
    private String shortCode;

    public Country(String n, String c) {
        this.name = n;
        this.shortCode = c;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getShortCode() {
        return shortCode;
    }

    public void setShortCode(String shortCode) {
        this.shortCode = shortCode;
    }

    @Override
    public String toString() {
        return name + "::" + shortCode;
    }

}

The code that reads the Excel file and builds the Countries list is as follows:

ReadExcelFileToList.java
package com.journaldev.excel.read;

import java.io.FileInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class ReadExcelFileToList {

    public static List<Country> readExcelData(String fileName) {
        List<Country> countriesList = new ArrayList<Country>();

        // Create the input stream from the xlsx/xls file;
        // try-with-resources closes it even if parsing fails
        try (FileInputStream fis = new FileInputStream(fileName)) {

            // Create the Workbook instance that matches the file extension
            Workbook workbook;
            if (fileName.toLowerCase().endsWith("xlsx")) {
                workbook = new XSSFWorkbook(fis);
            } else if (fileName.toLowerCase().endsWith("xls")) {
                workbook = new HSSFWorkbook(fis);
            } else {
                throw new IllegalArgumentException("Not an xls/xlsx file: " + fileName);
            }

            // Get the number of sheets in the workbook
            int numberOfSheets = workbook.getNumberOfSheets();

            // Loop through each of the sheets
            for (int i = 0; i < numberOfSheets; i++) {

                // Get the nth sheet from the workbook
                Sheet sheet = workbook.getSheetAt(i);

                // Every sheet has rows, iterate over them
                Iterator<Row> rowIterator = sheet.iterator();
                while (rowIterator.hasNext()) {
                    String name = "";
                    String shortCode = "";

                    // Get the row object
                    Row row = rowIterator.next();

                    // Every row has columns, get the column iterator and iterate over them
                    Iterator<Cell> cellIterator = row.cellIterator();

                    while (cellIterator.hasNext()) {
                        // Get the Cell object
                        Cell cell = cellIterator.next();

                        // Check the cell type and process accordingly.
                        // These int constants match POI 3.10; later releases use the CellType enum.
                        switch (cell.getCellType()) {
                        case Cell.CELL_TYPE_STRING:
                            if (shortCode.equalsIgnoreCase("")) {
                                // 1st column
                                shortCode = cell.getStringCellValue().trim();
                            } else if (name.equalsIgnoreCase("")) {
                                // 2nd column
                                name = cell.getStringCellValue().trim();
                            } else {
                                // random data, leave it
                                System.out.println("Random data::" + cell.getStringCellValue());
                            }
                            break;
                        case Cell.CELL_TYPE_NUMERIC:
                            System.out.println("Random data::" + cell.getNumericCellValue());
                            break;
                        default:
                            // other cell types are ignored
                            break;
                        }
                    } // end of cell iterator
                    Country c = new Country(name, shortCode);
                    countriesList.add(c);
                } // end of rows iterator

            } // end of sheets for loop

        } catch (IOException e) {
            e.printStackTrace();
        }

        return countriesList;
    }

    public static void main(String args[]) {
        List<Country> list = readExcelData("Sample.xlsx");
        System.out.println("Country List\n" + list);
    }

}

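The program is easy to follow. The main steps are:
1) Create the Workbook instance according to the file type (.xls or .xlsx): XSSFWorkbook for xlsx and HSSFWorkbook for xls. We could use the factory pattern to build a wrapper class that creates the Workbook instance based on the file name.
2) Use the Workbook's getNumberOfSheets() to get the number of sheets, then loop over and parse each sheet. Use the getSheetAt(int i) method to get the Sheet instance.
3) Get the Row and Cell iterators to reach each Cell object. Apache POI uses the iterator pattern here.
4) Use switch-case to process each Cell according to its type.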
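
Step 1 picks the implementation from the file name, which breaks when a file carries the wrong extension. The sketch below shows the alternative idea of inspecting the first bytes of the file instead, using only the JDK: binary .xls files are OLE2 compound documents, while .xlsx files are ZIP packages. The class ExcelFormatSniffer and its Format enum are hypothetical names for illustration; POI's own WorkbookFactory.create(...) does a comparable detection for you.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

/** Hypothetical helper that guesses the spreadsheet container format from its header. */
class ExcelFormatSniffer {

    enum Format { XLS_OLE2, XLSX_ZIP, UNKNOWN }

    // OLE2 compound document signature used by binary .xls files.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0, (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // ZIP local file header signature used by .xlsx packages.
    private static final byte[] ZIP_MAGIC = { 'P', 'K', 0x03, 0x04 };

    static Format detect(InputStream in) throws IOException {
        byte[] header = new byte[8];
        int read = 0;
        // A single read() may return fewer bytes than requested, so loop.
        while (read < header.length) {
            int n = in.read(header, read, header.length - read);
            if (n < 0) {
                break;
            }
            read += n;
        }
        if (read >= OLE2_MAGIC.length && Arrays.equals(header, OLE2_MAGIC)) {
            return Format.XLS_OLE2;
        }
        if (read >= ZIP_MAGIC.length && Arrays.equals(Arrays.copyOf(header, 4), ZIP_MAGIC)) {
            return Format.XLSX_ZIP;
        }
        return Format.UNKNOWN;
    }
}
```

Note that detect consumes the bytes it inspects, so the caller has to reopen the file (or wrap the stream so it can be reset) before handing it to a workbook constructor. The signatures only identify the container: .doc and .docx files share them, so this check complements the extension rather than replacing it.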

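‘5’ Problems reading doc and pdf files in Java

PDFBox is an open-source library for working with PDF files. Add PDFBox-0.7.3.jar to the classpath, and add FontBox1.0.jar to the classpath as well, otherwise an error is reported.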

import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;

import org.pdfbox.pdfparser.PDFParser;
import org.pdfbox.pdmodel.PDDocument;
import org.pdfbox.util.PDFTextStripper;

public class PdfReader {
    /**
     * 2008-2-25
     * @param pdfFilePath file path
     * @return all text in the pdf file
     */
    public static String getTextFromPDF(String pdfFilePath) {
        String result = null;
        FileInputStream is = null;
        PDDocument document = null;
        try {
            is = new FileInputStream(pdfFilePath);
            PDFParser parser = new PDFParser(is);
            parser.parse();
            document = parser.getPDDocument();
            PDFTextStripper stripper = new PDFTextStripper();
            result = stripper.getText(document);
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            // Close the stream and the document even if parsing failed
            if (is != null) {
                try {
                    is.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
            if (document != null) {
                try {
                    document.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Backslashes must be escaped inside a Java string literal
        String str = PdfReader.getTextFromPDF("C:\\Read.pdf");
        System.out.println(str);
    }
}

Code 2:

import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.MalformedURLException;
import java.net.URL;

import org.pdfbox.pdmodel.PDDocument;
import org.pdfbox.util.PDFTextStripper;

public class PDFReader {
    public void readFdf(String file) throws Exception {
        boolean sort = false;
        String pdfFile = file;
        String textFile = null;
        String encoding = "UTF-8";
        int startPage = 1;
        int endPage = Integer.MAX_VALUE;
        Writer output = null;
        PDDocument document = null;
        try {
            try {
                // First try to treat the argument as a URL; if that throws,
                // load the file from the local file system instead
                URL url = new URL(pdfFile);
                // Note: the parameter is no longer a URL as in earlier versions, but a File.
                document = PDDocument.load(pdfFile);
                // Get the file name of the PDF
                String fileName = url.getFile();
                // Name the generated txt file after the original PDF
                if (fileName.length() > 4) {
                    File outputFile = new File(fileName.substring(0, fileName.length() - 4) + ".txt");
                    textFile = outputFile.getName();
                }
            } catch (MalformedURLException e) {
                // Loading as a URL failed, so load from the file system
                // Note: the parameter is no longer a URL as in earlier versions, but a File.
                document = PDDocument.load(pdfFile);
                if (pdfFile.length() > 4) {
                    textFile = pdfFile.substring(0, pdfFile.length() - 4) + ".txt";
                }
            }

            output = new OutputStreamWriter(new FileOutputStream(textFile), encoding);

            PDFTextStripper stripper = new PDFTextStripper();
            // Set whether to sort by position
            stripper.setSortByPosition(sort);
            // Set the start page
            stripper.setStartPage(startPage);
            // Set the end page
            stripper.setEndPage(endPage);
            // Call PDFTextStripper's writeText to extract and output the text
            stripper.writeText(document, output);
        } finally {
            if (output != null) {
                // Close the output stream
                output.close();
            }
            if (document != null) {
                // Close the PDDocument
                document.close();
            }
        }
    }

    /**
     * @param args
     */
    public static void main(String[] args) {
        PDFReader pdfReader = new PDFReader();
        try {
            // Extract the content of the PDF at the given path
            // (backslashes must be escaped inside a Java string literal)
            pdfReader.readFdf("C:\\Read.pdf");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

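2. Extracting text from PDF files with Chinese support: xpdf
xpdf is an open-source project; we can call its native executable to extract text from Chinese PDF files.
http://www.java-cn.com/technology/tech_downs/1880_004.zip
Patch package:
http://www.java-cn.com/technology/tech_downs/1880_005.zip
Install the Chinese patch as described in the readme, and you can start writing a Java program that calls the native executable.
Here is an example of how to call it: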

import java.io.*;

/**
 * <p>Title: pdf extraction</p>
 * <p>Description: email: [email protected]</p>
 * <p>Copyright: Matrix Copyright (c) 2003</p>
 * <p>Company: Matrix.org.cn</p>
 * @author chris
 * @version 1.0
 */
public class PdfWin {
    public PdfWin() {
    }

    public static void main(String args[]) throws Exception {
        // Backslashes must be escaped inside a Java string literal
        String PATH_TO_XPDF = "C:\\Program Files\\xpdf\\pdftotext.exe";
        String filename = "c:\\a.pdf";
        // "-" as the output file makes pdftotext write the text to stdout
        String[] cmd = new String[]{PATH_TO_XPDF, "-enc", "UTF-8", "-q", filename, "-"};
        Process p = Runtime.getRuntime().exec(cmd);
        BufferedInputStream bis = new BufferedInputStream(p.getInputStream());
        InputStreamReader reader = new InputStreamReader(bis, "UTF-8");
        StringWriter out = new StringWriter();
        char[] buf = new char[10000];
        int len;
        while ((len = reader.read(buf)) >= 0) {
            // Accumulate every chunk; the buffer alone only holds the last read
            out.write(buf, 0, len);
            System.out.println("the length is " + len);
        }
        reader.close();
        String ts = out.toString();
        System.out.println("the str is " + ts);
    }
}
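
The pattern behind this example, running an external tool and collecting everything it prints, can be sketched in a reusable form as below. The class CommandOutput and its method names are hypothetical; the sketch assumes the tool writes UTF-8 text to standard output, as pdftotext does when given -enc UTF-8 and - as the output file.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.List;

/** Hypothetical helper for capturing the output of an external command. */
class CommandOutput {

    /** Reads the stream to the end and decodes all bytes in one step. */
    static String readAll(InputStream in, Charset charset) throws IOException {
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int len;
        while ((len = in.read(chunk)) != -1) {
            // Keep only the bytes actually read in this round.
            collected.write(chunk, 0, len);
        }
        // Decoding once at the end avoids splitting multi-byte characters across chunks.
        return new String(collected.toByteArray(), charset);
    }

    /** Runs a command, merges stderr into stdout, and returns everything it printed. */
    static String run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).redirectErrorStream(true).start();
        String output;
        try (InputStream in = process.getInputStream()) {
            output = readAll(in, StandardCharsets.UTF_8);
        }
        process.waitFor();
        return output;
    }
}
```

With the command line from the example, the call would look like CommandOutput.run(Arrays.asList(PATH_TO_XPDF, "-enc", "UTF-8", "-q", filename, "-")). Reading the output before calling waitFor matters: a process that fills its output pipe blocks until someone drains it.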

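‘6’ How to read a ppsx document in Java

To do this, save the "ppsx" file as a "ppt" file first, as follows. Step 1: start PowerPoint, choose File > Open to bring up the Open dialog, select the ppsx file, and click OK. Step 2: choose File > Save As to bring up the Save As dialog, and set the file type to "ppt".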
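
If converting by hand is not an option, it may help to know that a .ppsx file is an Office Open XML package: a ZIP archive whose slides are stored as ppt/slides/slideN.xml, with the visible text in DrawingML a:t elements. The sketch below reads those entries with the JDK only. It is a rough text dump under that assumption, not a full parser, and the class PpsxTextReader and method readSlides are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.NodeList;

/** Hypothetical helper that dumps the text runs of each slide in a ppsx/pptx package. */
class PpsxTextReader {
    private static final Pattern SLIDE = Pattern.compile("ppt/slides/slide(\\d+)\\.xml");
    private static final String DRAWINGML_NS =
            "http://schemas.openxmlformats.org/drawingml/2006/main";

    /** Returns slide number -> text runs joined by spaces, ordered by slide number. */
    static Map<Integer, String> readSlides(InputStream ppsx) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        DocumentBuilder builder = factory.newDocumentBuilder();

        // ZIP entry order is not guaranteed to match slide order, so sort by number.
        Map<Integer, String> slides = new TreeMap<>();
        try (ZipInputStream zip = new ZipInputStream(ppsx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                Matcher m = SLIDE.matcher(entry.getName());
                if (!m.matches()) {
                    continue;
                }
                // Buffer the entry first so the XML parser cannot close the zip stream.
                ByteArrayOutputStream xml = new ByteArrayOutputStream();
                byte[] chunk = new byte[8192];
                int len;
                while ((len = zip.read(chunk)) != -1) {
                    xml.write(chunk, 0, len);
                }
                NodeList runs = builder.parse(new ByteArrayInputStream(xml.toByteArray()))
                        .getElementsByTagNameNS(DRAWINGML_NS, "t");
                StringBuilder text = new StringBuilder();
                for (int i = 0; i < runs.getLength(); i++) {
                    if (i > 0) {
                        text.append(' ');
                    }
                    text.append(runs.item(i).getTextContent());
                }
                slides.put(Integer.parseInt(m.group(1)), text.toString());
            }
        }
        return slides;
    }
}
```

A call such as PpsxTextReader.readSlides(new FileInputStream("show.ppsx")) would return one entry per slide; the file name here is only a placeholder. Layout, notes, and images are ignored by this sketch.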

‘7’ Reading word, ppt, and pdf in JSP

Just write the PDF file into the response stream!
There are many ways to do it; here is a simple, self-contained example:

Java code:
package com.zhaipuhong.j2se.pdf;

import java.io.IOException;
import java.util.Date;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import com.lowagie.text.Document;
import com.lowagie.text.DocumentException;
import com.lowagie.text.Paragraph;
import com.lowagie.text.pdf.PdfWriter;
import com.lowagie.text.pdf.BaseFont;
import com.lowagie.text.pdf.PdfPTable;
import com.lowagie.text.pdf.PdfPCell;
import java.awt.Color;

public class PdfServlet extends HttpServlet {

    private static final long serialVersionUID = -6033026500372479591L;

    public void doGet(HttpServletRequest request, HttpServletResponse response)
            throws IOException, ServletException {

        // step 1: create the document object
        Document document = new Document();
        try {
            // step 2: set the response content type and bind a PdfWriter to the response stream
            response.setContentType("application/pdf");
            PdfWriter.getInstance(document, response.getOutputStream());

            // step 3: open the document
            document.open();
            // Chinese font support
            BaseFont bfChinese = BaseFont.createFont("STSong-Light", "UniGB-UCS2-H", BaseFont.NOT_EMBEDDED);
            com.lowagie.text.Font fontChinese = new com.lowagie.text.Font(bfChinese, 12, com.lowagie.text.Font.NORMAL);
            Paragraph paragraph = new Paragraph("你好", fontChinese);

            // step 4: add content to the document
            document.add(paragraph);
            document.add(new Paragraph(" Hello World !"));
            document.add(new Paragraph("Date 时间" + new Date().toString()));
            document.add(new Paragraph(new Date().toString()));
            document.add(new Paragraph(new Date().toString()));

            PdfPTable table = new PdfPTable(3);
            PdfPCell cell = new PdfPCell(new Paragraph("header with colspan 3"));
            cell.setColspan(3);
            table.addCell(cell);
            table.addCell("1.1");
            table.addCell("2.1");
            table.addCell("3.1");
            table.addCell("1.2");
            table.addCell("2.2");
            table.addCell("3.2");
            cell = new PdfPCell(new Paragraph("cell test1"));
            cell.setBorderColor(new Color(255, 0, 0));
            table.addCell(cell);
            cell = new PdfPCell(new Paragraph("cell test2"));
            cell.setColspan(2);
            cell.setBackgroundColor(new Color(0xC0, 0xC0, 0xC0));
            table.addCell(cell);
            document.add(table);

        } catch (DocumentException de) {
            de.printStackTrace();
            System.err.println("document: " + de.getMessage());
        }

        // step 5: close the document object
        document.close();
    }

    // Builds a paragraph with a font that supports Chinese
    public Paragraph getChineseString(String chineseString) {
        Paragraph paragraph = null;
        BaseFont bfChinese = null;
        try {
            bfChinese = BaseFont.createFont("STSong-Light", "UniGB-UCS2-H",
                    BaseFont.NOT_EMBEDDED);
            com.lowagie.text.Font fontChinese = new com.lowagie.text.Font(bfChinese,
                    12, com.lowagie.text.Font.NORMAL);
            paragraph = new Paragraph(chineseString, fontChinese);
        } catch (Exception ex) {
            ex.printStackTrace();
        }
        return paragraph;
    }
}

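‘8’ Reading the contents of pdf, word, excel, and ppt documents in Java: I downloaded the POI package but don't know how to use it. I'm new to Java; please tell me what to do

Reading pdf requires downloading PDFBox:
http://pdfbox.apache.org/
Create a new project, then import POI's src into it.
For "How to create an Eclipse Project", you can refer to:
http://mail-archives.apache.org/mod_mbox/poi-dev/201204.mbox/%3cCAPt+24QbEryNixQFuPhEsKx16oHcn_h5xEa0x9uMSEVYLe-fPw@mail.gmail.com%3e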

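‘9’ Calling a DLL from Java to operate on PPT

Your operation can be reduced to copying the ppt file and then renaming the copy, so it can probably be done without calling an external DLL.
Here are two approaches:
Method 1: run a cmd command from Java
try {
    Runtime.getRuntime().exec("put the DOS command here");
} catch (IOException e) {
    e.printStackTrace();
}

The cmd command for copying a file is [copy <path of file 1> <path of file 2>]
For example, to copy test.ppt on drive C to test1.ppt on drive C:
Command: copy c:\test.ppt c:\test1.ppt
In Java this is written as copy c:\\test.ppt c:\\test1.ppt or copy c:/test.ppt c:/test1.ppt
Because copy is built into cmd.exe rather than being a standalone program, prefix the command with cmd /c when passing it to exec.

Method 2: copy the file with Java IO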

import java.io.*;

public class CopyAll {
    // Recursively copy the directory "from" into "to"
    public void copyDir(File from, File to) {
        if (!to.exists()) {
            to.mkdirs();
        }
        File[] files = from.listFiles();
        if (files == null) {
            return; // "from" is not a readable directory
        }
        for (File file1 : files) {
            File file2 = new File(to.getPath() + File.separator + file1.getName());
            if (!file1.isDirectory()) {
                copyFile(file1, file2);
            } else {
                copyDir(file1, file2);
            }
        }
    }

    // Copy a single file; try-with-resources closes both streams
    public void copyFile(File src, File dest) {
        System.out.println(src.getAbsoluteFile() + " -> " + dest.getAbsoluteFile());
        try (FileInputStream in = new FileInputStream(src);
             FileOutputStream out = new FileOutputStream(dest)) {
            byte[] buffer = new byte[1024];
            int len;
            while ((len = in.read(buffer)) != -1) {
                // Write only the bytes actually read, not the whole buffer
                out.write(buffer, 0, len);
            }
            System.out.println("File copied successfully");
        } catch (IOException e) {
            System.out.println("File copy failed");
        }
    }

    public static void main(String[] args) {
        CopyAll t = new CopyAll();
        t.copyDir(new File("source path"), new File("destination path"));
    }
}
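
For copying a single file such as the ppt in the question, the JDK's java.nio.file.Files.copy (Java 7 and later) does the stream handling itself. A minimal sketch follows; the class name PptCopy and the choice to create missing parent directories are assumptions for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper: copies one file, e.g. test.ppt to test1.ppt. */
class PptCopy {

    /** Copies source to target, replacing the target if it already exists. */
    static void copy(Path source, Path target) throws IOException {
        Path parent = target.toAbsolutePath().getParent();
        if (parent != null) {
            // Make sure the destination directory exists before copying.
            Files.createDirectories(parent);
        }
        Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

For the example above, the call would be PptCopy.copy(Paths.get("c:/test.ppt"), Paths.get("c:/test1.ppt")).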

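Oh, sorry, I got off topic.
Java can call a DLL through JNI (Java Native Interface), but such a DLL differs from an ordinary one: its definitions have to follow certain rules, so Java cannot operate on an ordinary DLL this way. You also need to know C or C++ to write a DLL that Java can call; I can only handle simple ones like hello world myself. Besides, there are open-source projects for working with word, excel, and ppt files in Java; you can search Google for them.
For example: http://www.javayou.com/diary/1637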
